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CHAPTER  1 


INTRODUCTION  &  SUMMARY 

MAC  is  a  relatively  new  digital  control  design  technique  that  can  be 
implemented  using  dedicated  microcomputers  or  microprocessors.  In  its 
simplest  form,  MAC  consists  of: 

(i)  an  internal  model  of  the  system  to  be  controlled 

(ii)  a  reference  trajectory  description  of  the  desired  closed 
loop  behavior 

(iii)  an  on-line  optimization  of  future  control  inputs  to  produce 
the  desired  performances. 

This  technique  has  been  proven  successful  in  many  industries  and  aerospace 
applications.  Although  the  methodology  was  originally  developed  by  prac¬ 
ticing  engineers  from  heuristic  arguments,  single-input  single-output  MAC 
under  some  reasonable  assumptions  has  been  extensively  analyzed  in  the  pre¬ 
vious  report  AFWAL-TR-80- 3 125 •  As  a  result  of  basic  research  questions 
arising  in  this  previous  study,  the  present  work  on  adaptive  MAC  was  undertaken. 

The  main  objective  of  this  project  is  to  develop  an  adaptive  MAC  and  an 
appropriate  framework  for  robustness  analysis  particularly  when  the  plant 
is  compensated  apriori  by  a  fixed  gain  analog  controller.  Based  on  the 
objective  of  this  project,  this  report  is  primarily  divided  into  three 
parts:  an  adaptive  estimation  scheme  for  system  identification  of  the 

unknown  plant  dynamics  is  developed  and  analyzed  in  Part  1;  classical  and 
modern  robustness  analysis  techniques  are  applied  to  MAC  in  Part  2;  and  Part 
3  contains  the  results  on  simulation. 

The  methods  of  Parts  1  and  2  are  demonstrated  on  several  examples  by 
computer  simulation  in  Part  3.  Detailed  derivations  and  proofs  of  a  number 
of  the  results  are  contained  in  the  Appendices  in  the  form  of  published 
research  papers  or  papers  being  submitted  for  publication. 
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In  Chapter  2,  the  system  identification  procedure  for  adaptation 
to  system  changes  is  presented.  The  method  used  for  identification  is  the 
canonical  variate  analysis  (CVA)  technique.  This  method  has  been  developed 
in  the  last  several  years  and  overcomes  the  difficult  problems  in  currently 
available  methods  which  prevent  their  use  in  general  real-time  automated 
systems.  Some  of  the  difficulties  of  other  methods  are  first  discussed, 
and  the  attractive  features  of  CVA  are  described  including  the  statistical 
and  computational  robustness  of  the  method  as  well  as  the  inherent 
ability  to  determine  the  appropriate  model  state  order  from  the  obser¬ 
vational  data.  The  basic  conceptual  aspects  of  CVA  are  then  developed 
which  include  the  choice  of  a  best  set  of  reduced  states  of  the  past 
for  prediction  of  the  future  evolution  of  the  process.  This  is 
accomplished  by  a  canonical  variate  analysis  of  the  past  and  future. 

The  details  of  such  an  analysis  are  given  in  two  of  the  appendices. 

The  computational  aspects  of  the  procedure  involve  a  singular  value 
decomposition  which  is  a  very  accurate  and  numerically  stable 
algorithm.  The  close  relationship  between  the  CVA  method  and  the 
maximum  likelihood  and  instrumental  variable  methods  are  described. 

To  investigate  the  effect  of  external  input  excitations  on  the 
accuracy  of  the  identified  system  model,  simultaneous  confidence  bands 
on  the  identified  plant  transfer  function  and  disturbance  noise  power 
spectrum  are  computed.  The  details  of  this  computation  are  contained 
in  an  appendix.  Using  these  results  the  output  tracking  error  due  to 
both  control  and  identification  errors  is  derived  in  the  context  of 
stochastic  and  dual  control.  The  computational  aspects  of  the 
algorithms  are  described  including  the  basic  steps  and  amount  of  com¬ 
putation  with  the  detailed  computational  equations  contained  in  the 
appendices. 

Chapter  3  analyzes  MAC  when  applied  to  a  lightly  damped  plant 
that  has  been  compensated  apriori  by  constant  gain  output  feedback.  MAC 
software  uses  an  impulse  response  description  of  the  plant  which  has  a 
large  number  of  terms  and  is  not  suitable  for  analytical  studies. 

Therefore  in  this  chapter  MAC  has  been  described  using  a  rational  transfer 
function  model  (difference  equation  model)  of  the  plant  which  shows  that 
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one-step-ahead  MAC  can  also  be  explained  using  the  classical  root  locus 
technique.  In  chapter  4  an  appropriate  framework  is  developed  for  robust¬ 
ness  analysis  applying  the  perturbat ional  argument  to  the  Nyquist  plot  of 
the  steady  state  MAC  loop  transfer  function.  It  has  been  possible  to  apply 
the  current  robustness  analysis  technique  to  MAC  under  this  framework.  The 
analysis  gives  a  set  of  sufficient  conditions,  and  the  perturbed  closed- 
loop  system  remains  stable  if  the  additive  or  multiplicative  modelling 
error  of  the  plant  satisfies  these  conditions.  These  conditions  define  the 
neighborhood  of  the  identified  model  such  that  if  the  actual  plant  lies  in 
this  neighborhood  then  the  MAC  control  law  designed  on  the  basis  of  the 
identified  model  also  stabilizes  the  actual  plant.  Finally,  in  Chapter  5, 
new  techniques  are  developed  for  selecting  optimum  (possibly  unique) 
sampling  rates,  which  play  a  crucial  role  in  an  adaptive  control  scheme. 

The  sampling  time  interval  is  selected  on  the  basis  of  a  minimax  approach 
and  also  satisfies  the  classical  Nyquist  sampling  rate. 

Finally,  in  Chapter  6,  extensive  simulation  results  have  been  presented 
and  in  Chapter  7  conclusions  and  summary  are  provided. 

The  major  conclusion  of  this  report  is  that  MAC  is  a  very  effective  and 
superior  control  technique  for  linear  multivariable  plants  in  a  deter¬ 
ministic  environment  as  well  as  in  an  uncertain  environment  where  the  plant 
is  not  exactly  known.  The  adaptive  MAC  has  also  been  found  to  be  success¬ 
ful  where  the  plant  is  slowly  time  varying  and/or  non-linear.  The  robust¬ 
ness  properties  of  standard  MAC  and  adaptive  MAC  have  been  verified  by 
extensive  simulations  of  the  missile  attitude  control  problem.  A  complete 
model  of  MAC  for  a  multi-step-ahead  optimization  horizon  and  input-blocking 
is  not  yet  available,  and  without  this  the  theoretical  properties  of  a  real 
world  MAC  are  not  available  in  an  analytical  form.  It  is  recommended  that 
future  studies  of  MAC  concentrate  on  (i)  developing  a  complete  model  of  the 
MAC  algorithm,  (ii)  comparison  of  MAC  performances  with  other  control  design 
techniques,  and  (iii)  applying  an  adaptive  MAC  to  a  full  scale  flight 
control  problem. 
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Part  2 


CHAPTER  2:  SYSTEM  IDENTIFICATION 


2.1  Introduction 

There  has  been  considerable  progress  in  system  identification  in  recent 

years.  The  method  of  maximum  likelihood  has  been  established  as  the  most 
accurate  in  theory,  although  the  computational  burden  and  numerical  con¬ 
ditioning  are  serious  problems  particularly  for  general  applications  where 
the  number  of  parameters  can  easily  be  dozens  of  even  hundreds.  A  number 
of  simplified  schemes  have  been  considered  such  as  recursive  ML  and  instru¬ 
mental  variable  methods.  While  these  methods  have  reduced  computational 
requirements,  there  are  difficulties  with  initialization  and  with  accuracy 
in  small  samples  which  are  of  particular  interest  in  tracking  dynamical 
systems.  Also  these  methods  are  not  entirely  reliable  numerically  since 
they  depend  upon  the  ARMA  parameterization  which  is  known  to  have  global 
singularities  (Gevers  and  Wertz,  1984).  Also  if  the  system  order  is  over 
estimated,  then  the  computations  become  ill-conditioned.  This  considerably 
complicates  the  task  of  determining  the  state  order  which  is  usually 
unknown.  A  number  of  more  ad  hoc  schemes  are  available,  but  these  have 
even  less  desirable  statistical  or  computational  properties. 

Fortunately,  in  the  last  several  years,  a  new  method  has  been  developed 
using  the  approaches  of  canonical  variate  analysis  (CVA)  method  of  mathema¬ 
tical  statistics,  stochastic  realization  concepts  from  system  theory,  and 
information  or  entropy  methods  for  the  statistical  choice  of  model  order 
and  structure.  This  method  has  some  highly  desirable  properties.  The 
order  of  the  state  is  determined  statistically.  The  computation  is  based 
upon  a  singular  value  decomposition  which  is  one  of  the  most  stable  and 
accurate  numerical  procedures  available.  The  model  fitting  and  state  order 
selection  is  always  numerically  well  conditioned.  The  model  fitting 
accuracy  has  been  found  to  be  very  close  to  maximum  likelihood  in  moderate 
and  large  samples  sizes.  The  canonical  variate  analysis  method  for 
system  identification  has  been  used  as  the  primary  procedure  in  this  study 
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because  it  is  the  only  method  currently  available  with  the  above  proper¬ 
ties.  Furthermore,  it  handles  with  no  additional  complication  the  dif¬ 
ficult  multi-input  multi-output  system  identification  problem.  In  the 
development,  the  CVA  method  is  discussed  in  Section  2.2,  and  the  close 
relationship  of  CVA  to  the  instrumental  variable  and  maximum  likelihood 
methods  are  discussed  in  Sections  2.3  and  2.4  respectively.  The  topics  of 
input  design  and  sampling  for  identif iability  are  described  in  Section  2.5, 
while  the  approaches  of  stochastic  and  dual  control  for  input  design  are 
discussed  in  Section  2.6.  Finally  the  computational  aspects  of  the  CVA 
method  are  discussed  in  Section  2.7.  The  detailed  derivations  supporting 
these  sections  are  contained  the  various  appendices. 


2.2  Canonical  Variate  Analysis  of  Time  Series 

The  canonical  variate  analysis  method  of  system  identification  was  first 
proposed  by  Akaike  (1975).  In  this  fundamental  contribution,  a  stochastic 
realization  algorithm  was  proposed  by  using  the  statistical  method  of  cano¬ 
nical  correlation  analysis  on  the  Hankel  covariance  matrix  to  choose  a 
basis  for  the  state  space  and  to  statistically  determine  the  rank  of  the 
state  space.  This  provided  a  fundamentally  new  and  statistical  approach  to 
the  determination  of  a  dynamical  system  on  the  basis  of  noisy  and  finite 
length  data.  The  statistical  determination  of  state  order  was  based  upon 
the  Akaike  information  criterion  (AIC).  This  initial  work  did  not  consider 
the  case  of  an  input  to  the  system,  but  considered  only  the  case  of  an  out¬ 
put. 


Later  work  (Larimore,  1983,  in  Appendix  B)  includes  the  more  general 
case  of  a  multi-input  multi-output  system.  The  computational  procedure  of 
this  method  is  more  efficient  in  requiring  only  one  canonical  correlation 
analysis,  and  can  also  be  used  to  solve  the  reduced  order  modeling  problem 
using  a  general  quadratic  weighting  on  the  prediction  error  of  the  future. 
Furthermore,  a  more  exact  computation  of  the  AIC  is  used  for  order  deter¬ 
mination  than  that  used  in  the  original  work  of  Akaike. 

The  approach  to  system  identification  using  generalized  canonical 
variables  is  described  in  some  detail  in  Larimore  (1983,  in  Appendix  B). 
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That  approach  involves  consideration  of  the  past  p(t)  and  future  f(t)  of  a 
vector  process  at  a  time  t  defined  as 


PT(  t  )=*( yT(  t)  ,uT(t)  ,yT(  t-l)  ,uT( t-1 ) , . . .  )T 


(2.1) 


fT(t)=(yT(t),yT(t-l),...)T 


(2.2) 


where  u(t)  is  the  input  and  y(t)  is  the  output  of  an  unknown  system  with 
state  space  structure  of  the  form 

x(t+l)=«x(t)+Gu(t)+w(t)  (2.3) 

y( t )=Hx( t )+Au( t )+Bw( t )+v( t )  (2.4) 


with  v(t)  a  measurement  noise  and  w(t)  a  process  noise  with  respective 
cross  spectral  density  matrices  R  and  Q.  From  the  theory  of  Markov 
processes  and  in  particular  the  theory  of  stochastic  realization,  the 
minimal  state  vector  defines  the  information  from  the  past  relevant  to 
the  future  of  the  process  and  is  called  the  predictor  space  (Akaike,  1974a). 

The  approach  of  canonical  variables  to  system  identification  is  to 
determine  the  optimal  set  of  linear  combinations  m(t)  of  the  past  p(t) 
that  best  predict  the  future  f(t)  in  terms  of  minimizing  the  prediction  error 


E  |  |  f-f  |  |  =E [ ( f-f )T  Cov"1  (f  ,f)  (f-f ) ] 


(2.5) 


where  Cov(f,f)  is  the  covariance  matrix  of  the  future  f  and  f  is  the  best 
prediction  of  f  based  upon  the  memory  m(t)  .  This  optimization  problem 
involves  the  optimal  selection  of  the  dimension  of  m(t)  as  well  as 
the  optimal  selection  of  the  linear  combinations  of  the  past. 

The  solution  to  this  problem  is  derived  in  Larimore  (1985a),  included  in 
appendix  A,  in  terms  of  a  generalized  singular  value  decomposition  (SVD). 
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This  solution  is  precisely  a  generalization  of  the  classical  canonical 
correlation  analysis  problem  of  mathematical  statistics  (Hotelling,  1936). 
Modern  computational  procedures  use  a  singular  value  decompositions  (Golub, 
1969)  involving  the  covariance  matricies  of  the  past  and  future.  The 
generalized  SVD  determines  transformations  J  and  L  and  a  diagonal  matrix  D 
such  that 


JCov(p,f)L  =  Diag(y1>...  >Yj>  0,...,0)=D  (2.6) 

JCov(p , p) J  =  I;  LCov(f,f)L  =  I  (2.7) 

The  transformations  can  be  interpreted  as  defining  a  new  set  of  coor¬ 
dinates  for  the  past  and  future  in  which  the  covariance  are  D,  I  and  I  as 
given  in  the  last  equation.  If  in  (2.5)  and  (2.7),  the  covariance  matrix 
Cov(f,f)  is  replaced  by  an  arbitrary  positive  semidefinite  weighting  matrix 
A,  then  the  above  generalized  SVD  still  gives  the  solution  to  minimizing 
the  weighted  prediction  error  (2.5)  even  though  the  covariance  rela¬ 
tionships  no  longer  hold  (Larimore,  1985a). 

For  a  full  order  state  model,  the  optimal  memory  or  state  x(t)  is 
related  to  the  past  p(t)  in  terms  of  the  first  k  canonical  variables  as 
m(t)  =  (Isub  k,  O)Jp(t),  i.e.  the  first  k  components  of  the  canonical  pre¬ 
dictor  variables  Jp(t).  A  minimal  order  realization  is  obtained  with  this 
choice  of  state.  The  computation  of  the  state  space  matricies  is  given  in 
Larimore  (1983)  in  Appendix  B.  The  state  space  matricies  and  noise 
covariance  matricies  are  given  by  a  linear  regression  as  specified  by  the 
state  space  equations  (2.3)  and  (2.4). 

In  system  identification,  the  covariance  raatrics  are  not  known  but  are 
estimated  from  the  observations.  The  statistical  determination  of  rank  in 
the  canonical  variate  analysis  is  given  approximatley  using  standard  cano¬ 
nical  correlation  analysis  methods  (Akaike,  1976).  A  more  refined  cora- 
pareson  between  the  different  order  models  is  given  by  use  of  the  Akaike 
information  criterion  (AIC)  which  is  asymptotically  optimal  in  minimizing 
entropy  (Shibata,  1981).  The  use  of  entropy  measures  such  as  the  AIC  has  a 
fundamental  justification  in  terms  of  the  basic  statistical  principles  of 
sufficiency  and  repeated  sampling  (Larimore,  1983a). 
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The  minimal  order  realization  is  unique  independent  of  the  weighting 
matrix  A,  but  when  a  reduced  memory  is  selected,  the  approximate  system 
does  not  in  general  minimize  the  prediction  error  for  that  order.  This  is 
because  the  reduced  rank  canonical  variables  are  not  in  general  recursively 
computable.  However  in  the  case  of  the  statistical  rank  determination 
problem,  there  is  an  insigificant  difference  between  the  state  of  the 
realized  system  corresponding  to  the  statistically  optimum  choice  of  order 
and  the  full  rank  canonical  variables. 

2.3  Relationship  with  the  Method  of  Instrumental  Variables 

The  instrumental  variables  method  gas  a  natural  interpretation  in  terms 
of  the  generalized  canonical  variate  problem.  In  the  instrumental 
variables  approach,  the  state  equations  (2.3)  are  considered  as  unobserved 
structural  relationships  that  are  indirectly  observed  through  the  noisy 
measurement  equations  (2.4).  A  vector  m(t)  of  instrumental  variables  is 
constructed  which  is  hopefully  close  to  the  true  state  x(t).  This  is  used 
in  place  of  the  true  state  in  solving  the  problem.  This  apparently  works 
well  for  an  appropriate  choice  of  the  instrumental  variables  when  the  true 
order  of  the  system  is  known  or  well  chosen.  In  other  cases,  this  approach 
may  lead  to  inaccurate  models. 

A  more  general  problem  is  the  optimal  choice  of  instrumental  variables 
for  a  specified  order  k  of  the  model  as  posed  by  Rao(L973,  1979)  (see  also 
Larimore,  1985a,  in  Appendix  A).  This  is  formulated  as  finding  the  optimal 
choice  of  k  linear  combinations  of  the  past  p(t)  that  predict  the  future 
f(t)  as  measure  in  terms  of  the  squared  error  (f-f)T  (f-f).  This  is  preci¬ 
sely  the  generalized  canonical  variate  problem  with  weighting  matrix  A  =  I. 
If  k  is  chosen  as  full  rank,  then  the  memory  and  the  state  space  realiza¬ 
tion  are  independent  of  the  weighting.  However,  for  lower  rank  k,  there 
can  be  a  considerable  difference  between  the  state  space  and  reduced  order 
system  (Larimore,  1983).  The  squared  error  of  instrumental  variables 
relates  to  energy  while  the  canonical  correlation  analysis  relates  to  the 
statistical  significance  of  the  problem.  Thus  the  canonical  correlation 
analysis  can  be  viewed  as  an  optimal  choice  of  the  instrumental  variables 
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using  the  appropriate  weighting  of  the  prediction  errors  for  the  deter¬ 
mination  of  the  statistically  significant  number  of  states. 

Time  recursive  methods  using  instrumental  variables  and  approximate 
maximum  likelihood  (IV-AML)  are  claimed  to  be  an  approximately  efficient 
parameter  identification  method  for  large  samples  as  shown  in  simulation 
examples  (Young  and  Jakeman,  1979).  This  is  shown  by  Monte  Carlo  simula¬ 
tion  and  by  estimating  the  parameter  estimation  error  covariance  matrix. 
Below  it  is  shown  by  Monte  Carlo  simulation  that  the  canonical  correlation 
method  also  gives  efficient  identification  of  the  system  dynamics.  This  is 
done  by  evaluating  the  spectral  estimation  error. 


2.4  Maximum  Likelihood  Efficiency  of  CVA 

The  canonical  variate  system  identification  procedure  has  been  found  in 
moderate  sample  sized  to  be  close  to  the  lower  bound  of  maximum  likelihood 
estimation.  There  is  no  proof  available  for  this,  however  simulations 
have  shown  this  to  be  the  case.  There  is  some  theory  to  suggest  why  cano¬ 
nical  variate  analysis  is  an  efficient  estimation  procedure. 

Conditional  upon  the  choice  of  the  state  vector  by  the  canonical 
variate  analysis,  the  computation  of  the  state  space  raatricies  by 
regression  is  a  maximum  likelihood  procedure.  The  difficulty  in  proving 
the  asymptotic  efficiency  of  CVA  is  that  for  correlated  time  series  there 
is  no  proof  that  CVA  gives  the  choice  of  state  that  will  result  in  maximum 
likelihood  estimates  unconditionally. 

The  lower  bound  for  estimating  the  power  spectrum  and  transfer  function 
is  given  in  Larimore  (1985a,  in  Appendix  A)  as  a  function  of  frequency. 

From  extensive  simulations,  the  canonical  variate  analysis  gives  an  iden¬ 
tified  system  within  the  lower  bound  error  of  the  maximum  likelihood  proce¬ 
dure  at  each  frequency  as  shown  in  Larimore,  Mahmood,  and  Mehra  (1984,  in 
Appendix  D). 

2.5  Input  Design  and  Sampling 

The  accuracy  of  the  identified  plant  model  and  subsequent  control  tracking 
error  depends  upon  the  sampling  rate,  sample  size,  the  presence  of  implicit 
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or  explicit  extra  input  signals,  and  the  presence  of  disturbance  or  output 
measurement  noise.  In  fact  the  presence  of  a  linear  feedback  control  pro¬ 
vides  no  information  for  identification  of  the  plant  (Ljung,  Gustafson,  and 
Soderstrom,  1974) ,  and  some  additional  input  signal  is  required  for  plant 
identif iability.  Recently,  Anderson( 1985)  has  shown  that  available  adap¬ 
tive  control  methods  that  do  not  have  persistent  excitation  of  the  system 
necessarily  exhibit  burst  phenomena  of  short  periods  with  large  tracking 
errors  when  the  system  parameters  drift  far  from  the  true. 

The  requirement  for  additional  information  is  easily  seen  since  the 
presence  of  a  linear  feedback  could  be  present  in  the  plant  internally  and 
the  actual  input  could  be  unconnected  to  the  system  and  still  give  exactly 
the  same  response.  On  the  presumption  of  a  strictly  linear  plant,  a  nonli¬ 
near  feedback  can  be  used  to  provide  identif iability .  Also  a  switching 
between  different  linear  feedback  systems  can  provide  identif iability.  A 
better  approach,  however  is  to  use  an  explicit  additional  input  excitation. 
Such  an  excitation  is  best  chosen  to  be  a  broad  band  noise  type  of  spectrum 
which  guarantees  that  it  is  persistently  exciting. 

In  some  applications,  there  are  implicit  excitations  such  as  wind  gust 
turbulence  on  an  aircraft  which  provide  some  information  about  the  plant. 

If  the  power  spectrum  of  the  turbulence  is  exactly  known  along  with  the 
input  coupling  to  the  plant  state,  then  this  can  provide  amplitude  information 
about  the  transfer  function  from  the  gust  input  to  the  output.  In  particular, 
the  relationship  between  the  observed  output  spectrum  Sy(z)  and  the 
assumed  input  noise  spectrum  Sn(z)  and  transfer  function  H(z)  is 

Sy(z)=H(z)Sn(z)H*(z)  (2.8) 

Unfortunately,  in  most  cases  this  is  not  very  helpful  since  the  gust 
spectrum  is  not  accurately  known  and  is  highly  variable  with  time.  Also 
the  gust  input  coupling  to  the  state  will  generally  be  different  than  the 
control  input.  Furthermore,  this  provides  only  amplitude  information,  and 
for  control  the  transfer  function  phase  can  be  crucial. 
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The  best  input  excitation  is  one  that  is  incorrelated  with  the  system 
state.  The  spectrum  of  the  input  excitation  can  be  chosen  on  the  basis  of 
the  plant  transfer  function,  and  the  disturbance  and  output  measurement 
noise  spectra.  The  resulting  plant  identification  error  expected  at  each 
frequency  is  a  complicated  function  of  the  above  power  spectrum  and 
transfer  functions  as  well  as  the  parameterization  of  the  model.  A 
detailed  derivation  and  description  of  the  transfer  function  and  noise 
spectrum  estimation  error  variance  at  each  frequency  is  given  in  Larimore 
(1985b,  in  Appendix  E).  These  expressions  are  complicated  but  can  be  used 
to  calculate  the  estimation  error  and  produce  simultaneous  confidence 
bounds  on  the  estimated  transfer  and  spectral  functions. 

An  additional  consideration  in  identification  accuracy  is  the  sample 
rate  and  rate  of  reidentification  of  the  system  or  equivalently  the  sample 
size.  The  issue  of  sample  rate  for  representing  a  continuous  time  system 
is  covered  in  Section  5.  The  primary  consideration  in  choosing  the  sample 
rate  is  to  insure  that  the  important  frequency  information  is  preserved  and 
that  the  higher  frequencies  of  no  interest  do  not  degrade  the  estimation  by 
aliasing.  For  large  sample,  the  sample  size  has  a  simple  relationship  to 
the  accuracy  of  the  identified  system  which  increases  proportional  to  the 
inverse  square  root  of  the  sample  size.  For  moderate  sample  sizes  of 
several  hundred  which  is  of  primary  interest,  this  relationship  can  be 
expected  to  hold  approximately. 

As  an  example  of  the  accuracy  bounds  that  are  obtainable  from  the 
methods  in  Larimore  (1985b,  in  Appendix  E) ,  consider  the  case  identifying 
the  transfer  function  of  an  ARMA(4,3)  model  discussed  in  Larimore  et  al 
(1984,  in  Appendix  D)  with  a  sample  of  800  which  is  observed  in  closed  loop 
with  a  white  noise  input  excitation  and  a  white  output  measurement  noise 
with  the  signal  to  noise  power  ratio  of  the  input  to  output  equal  to  0.10. 
Then  the  transfer  function  of  the  true,  identified,  and  simultaneous  con¬ 
fidence  bands  about  the  estimated  are  shown  in  figure  2.1.  The  confidence 
bands  contain  the  true  transfer  function  entirely  within  the  bands  across 
the  entire  frequency  range  with  probability  0.95.  Note  that  the  confidence 
bands  are  quite  tight  in  both  phase  and  amplitude.  For  a  lower  sample 
size,  the  confidence  bands  are  wider  by  a  factor  of  the  square  root  of  the 
sample  size. 
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Figure  2.1  Power  Spectral  Density  of  ARMA(4,3)  Process, 

True  (solid),  Estimated  (dashed),  and  Simultaneous 
Confidence  Band  (dotted). 
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2.6  Stochastic  and  Dual  Control 


In  stochastic  and  dual  control,  the  effect  of  the  stochastic  input  on 
both  plant  identification  and  control  tracking  error  is  taken  into  account. 
This  is  also  possible  in  the  adaptive  MAC  framework.  In  this  section,  we 
derive  the  tracking  error  as  a  function  of  the  stochastic  input  excitation, 
plant  disturbance  and  measurement  noise,  and  the  MAC  controller  plant 
mismodelling  error. 

The  closed-loop  transfer  function  from  the  plant  input  u(z)  and  the 
composite  plant  disturbance  and  measurement  noise  n(z)  as  seen  at  the 
plant  output  to  the  observed  output  y(z)  is  given  in  Section  4.2  and  can 
be  expressed  as 


/  x=  (z-l)H(z)u(z)+(z-l)n(z) 
;  (z"-l-a)I+ctR(z) 


(2.9) 


where  the  relative  error  R(z)  in  estimating  the  plant  transfer  function  is 
defined  as 


R(z)  =  H-l(z)[H(z)-H(z)] 


(2.10) 


Here  H(z)  is  the  true  and  H(z)  is  the  identified  plant  open  loop  transfer 
function.  Now  for  a  complex  differentiable  function  w  =  f(x)  of  a 
complex  random  variable  x  with  mean  y  ,  the  variance  of  the  function  is 
derived  from 


f  (a)  =  f  ( y )  +  f'  (y)  (x  -  y) 

which  holds  to  first  order  so 


(2.11) 
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E  |  f  (x)  -  f  (y)|  2  =  e|  [  f  (  x)  -  f(u)][f(x)  -  f(u)]* 

=  |  f '  (y)  I  2  E  I  x-u  I  2 


(2.12) 


In  the  context  of  the  identification  and  control  involving  different 
segments  of  data,  we  have  approximate  independence  between  the  processes 
u(z),  n(z)  and  the  transfer  function  relative  estimation  error  R(z). 

Thus  the  tracking  error  due  to  the  input  and  disturbance  excitation  as  well 
as  the  plant  modelling  error  is 


E  j  y(z)  J  2  =[|  G(z)|  2  Su(z)j  +  |  J(z)|  2Sn(z)][  1  + 


(z-l)+  a  2 


Var  [R( z ) ] ] 


(2.13) 


where  G(z)  and  J(z)  are  the  closed  loop  transfer  functions  from 
the  input  excitation  and  disturbance  noise  excitations  respectively 
to  the  plant  output,  and  where  Sr,  (z)  is  the  spectrum  of  the  plant 
disturbance  and  measurement  noise  as  seen  at  the  plant  output  in  open  loop 
operation. 

It  is  seen  that  as  the  input  excitation  is  increased,  the  control 
tracking  error  increases  for  a  fixed  relative  modeling  error  R(z)  ,  but 
the  increased  excitation  decreases  the  relative  error  in  identification. 
The  quantity  Var  [  R(z)  ]  ,  the  relative  squared  error  of  identifying  the 
transfer  function  is  derived  in  Larimore  (1985b,  Appendix  E).  This  is  a 
function  of  the  characteristics  of  the  plant  transfer  function  as  well  as 
those  of  the  process  and  disturbance  noise  spectrum  characteristics.  The 
expressions  for  computing  these  quantities  are  straight  forward  but  not 
easily  expressed  analytically.  Thus  as  in  the  stochastic  dual  control 
literature,  the  optimal  design  is  analytically  intractable  and  requires  a 
numerical  approach. 
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2.7  Computational  Considerations 


In  this  section  the  major  computational  steps  in  the  algorithm  are 
described.  The  detailed  computational  equations  are  contained  in  the 
appendices . 

The  computational  steps  in  the  identification  algorithm  are  shown  in 
Figure  2.2.  In  the  identification  of  the  plant,  first  the  covariance  among 
the  past  and  future  are  computed.  Second,  a  canonical  correlation  analysis 
between  the  past  and  future  is  performed.  From  this,  a  comparison  of  the 
various  state  space  model  orders  is  computed  using  the  AIC  criterion.  On 
the  basis  of  this,  the  best  state  order  is  selected  and  the  state  space 
matrices  computed  by  regression.  This  state  space  model  is  then  used  in 
the  MAC  controller.  The  detailed  computations  of  these  blocks  are  con¬ 
tained  in  Larimore  (1983,  in  Appendix  B)  except  for  the  AIC  computation. 

An  approximate  AIC  computation  is  given  in  Akaike  (1976)  as 


k 

AIC(k)  =  E  logl-Y2)  +2pk 

j  =  l 


(2.14) 


where  p^  is  the  number  of  parameters  fitted  in  the  model. 

To  evaluate  the  AIC,  the  number  of  free  parameters  adjusted  in  the 
canonical  variate  procedure  is  required.  For  a  state  space  model  of  state 
order  k  of  the  form  of  Equations  (2.3)  and  (2.4),  there  are  a  number  of 
implied  constraints  so  that  it  is  not  correct  to  simply  count  the  number  of 
elements  of  the  various  matricies.  The  number  of  functionally  independent 
free  parameters  p^  including  the  process  and  measurement  noise  covariance 
is  (Candy,  Bullock,  and  Warren,  1979) 


Pk  =2kn+n(n+l)/2+km+nm 


(2.15) 


where  n  and  m  are  the  vector  dimensions  of  the  number  of  outputs  and 
inputs  respectively  at  a  given  time.  If  there  is  no  instantaneous  feedfor- 
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ward,  then  the  term  nm  is  deleted,  while  if  there  is  no  input  the  terms 
km  +  nm  are  deleted. 

The  AIC  expression  (2.14)  is  only  approximate,  and  the  precise  eva- 

A 

luation  is  given  by  computing  the  state  space  model  0k  for  competing  order 
models  and  doing  an  exact  evaluation  of  the  AIC  by 


AIC(k)  =  -2  log  p(Y,0k)+2pk 


(2.16) 


The  state  order  is  chosen  which  minimized  the  AIC(k). 

The  major  computations  are  the  covariance  and  the  singular  value  decom¬ 
position.  Once  the  plant  state  order  is  determined,  the  computation  of  the 
state  space  matricies  requires  relatively  little  computation.  For  slow 
identification  rates,  the  computation  becomes  proportional  to  the  sample 
size  times  the  the  dimension  of  the  past  and  future,  while  for  fast  iden¬ 
tification  rates,  the  computation  is  proportional  to  the  cube  of  this 
dimension. 
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PART  2 


CHAPTER  3 

MULTIVARIABLE  MAC  IN  A  CLASSICAL  CONTROL  FRAMEWORK 


3,1  Introduction 


The  theoretical  properties  of  MAC  have  been  studied  in  details  in 
the  previous  report  (AFWAL-TR-80-3125)  using  the  impulse  response  (IR) 
model  of  the  plant.  The  reason  for  using  the  IR  description  of  the  * 
plant  is  that  the  MAC  software  (known  as  IDCOM)  uses  this  description 
of  the  internal  model  in  the  computation  of  the  control  sequence.  The 
IR  description  of  the  plant  is  the  basis  of  the  MAC  technique  where  a 
quadratic  optimization  problem  is  formulated  explicitly  in  terms  of 
the  future  control  sequence.  The  IR  description  of  the  plant  is 
superb  from  the  computational  point  of  view,  but  it  has  a  disadvantage 
that  this  description  is  not  parsimonious  i.e.  it  contains  too  many 
parameters  and  is  therefore  not  suitable  for  analytical  studies. 

Since  one  of  the  objective  of  this  project  is  to  investigate  analyti- 
caly  various  aspects  of  MAC,  the  MAC  technique  is  described  in  this 
chapter  in  terms  of  a  difference  equation  (DE)  model  of  the  plant. 

The  DE  description  usually  contains  far  fewer  number  of  parameters 
than  an  IR  description  and  is  therefore  suitable  for  analytical  studies 
if  a  low  order  plant  is  selected  in  the  analysis. 

There  is  no  mathematical  model  for  a  generalized  MAC  with 
multistep  ahead  optimization  horizon,  input  blocking,  input 
constraints  etc.  Therefore  it  is  not  possible  to  investigate  analyti¬ 
cally  the  properties  of  a  generalized  MAC  control  law.  The  MAC  stra- 
tegy  generates  an  optimal  control  sequence  by  on-line  optimization  of 
a  cost  functional  and  the  first  element  of  this  sequence  is  applied  to 
the  actual  system.  It  has  been  shown  in  an  earlier  report  that  if  the 
plant  is  minimum  phase  and  the  cost  functional  is  optimized  over  one 
step  ahead,  then  the  MAC  control  law  can  be  interpreted  in  a  classical 
control  framework.  In  this  chapter  we  extend  this  interpretation  to 
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multivariable  systems  and  indicated  how  the  robustness  of  MAC  can  be 
assessed  in  this  framework. 

Section  3.2  extend  the  earlier  descriptions  of  MAC  to  multi-input 
multi-output  (MIMO)  systems  which  shows  that  MIMO  MAC  can  also  be 
interpreted  in  a  standard  unity  feedback  configuration.  With  a  slight 
modification  of  this  configuration  it  is  shown  that  MAC  can  be 
explained  in  a  multivariable  root-locus  framework.  The  root-locus 
technique  gives  the  locations  of  the  closed-loop  poles  as  the  output- 
feedback  gain  is  changed  from  zero  to  infinity.  Usually  a  rational 
transfer  function  or  difference-equation  (DE)  model  of  the  plant  is 
used  in  this  technique.  Therefore  in  order  to  cast  MAC  technique  in  a 
root-locus  framework,  MAC  has  been  described  in  section  3.3  using  the 
DE  model  of  the  plant.  Using  this  analysis,  the  root-locus  interpre¬ 
tation  of  MAC  is  presented  in  section  3.4.  Finally  the  MAC  for  a 
lightly  damped  system  is  discussed  in  section  3.5  where  it  has  been 
shown  qualitatively  that  one  should  not  try  to  use  a  high  gain  output 
feedback  to  introduce  sufficient  damping  in  a  lightly  damped  system, 
otherwise  a  high  sampling  rate  may  have  to  be  selected.  Conclusions 
are  discussed  in  section  3.6. 

3.2  What  is  MAC?  -  An  Overview 


MAC  control  strategy  has  been  described  and  analyzed  in  earlier 
reports  and  publications  [1,4, 5, 6].  We  include  here  a  simple  descrip¬ 
tion  of  MAC  for  the  sake  of  completeness  of  this  report.  The 
following  is  an  extended  version  of  the  earlier  descriptions  for  MIMO 
plants . 

The  MAC  methodology  generates  a  control  sequence  by  on-line  opti¬ 
mization  of  a  cost  functional,  and  the  algorithm  is  suitable  for 
implementation  on  microprocessors.  One  of  the  attractive  features  of 
MAC  is  the  clear  and  transparent  relationship  between  system  perfor¬ 
mance  and  various  design  parameters  embedded  in  the  design  procedure. 
There  are  five  basic  elements  in  MAC  (we  assume  in  the  following  that 
the  input  sequence  u(n)  is  m-dimensional  and  output  sequence  y(n)  is 
p-dimensional) : 
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(i)  An  actual  stable  plant,  possibly  not  known  exactly,  with  a 
pulse  response  sequence  { Hn | ,  n=l,2,*..N  where  each  Hn  is  pxm  dimen¬ 
sional  matrix  (we  assume  for  simplicity  that  the  plant  has  no  time 
delay  and  is  purely  dynamic  i.e.  it  has  no  feedthrough  term).  Then 
the  input  sequence  u(n)  and  the  output  sequence  y(n)  are  related  by 

y(n)  =  Ri  u(n-l)  +  H2  u(n-2)  +  ...+  %  u(n-N)  (3,1a) 

or,  Y(z)  =  H(z)U(z)  (3.1b) 

where  U(z),  Y(z)  and  H(z)  are  z-transforms  of  u(n) ,  y(n)  and  {Hn} 

respectively. 

Here 

H(z)  =  +  H2z~2  +  ...+  Hnz“n  =  Hp(z)z”N 

where  Hp(z)  is  a  pxm  dimensional  polynomial  matrix  in  z  and  is 

given  by 

Hp( z)  =  HizN_1  +  H2zN“2  +...  +  hN  (3.1c) 

This  model  is  known  as  an  "all-zero’*  model  and  Hp(z)  determines  zeros 
of  the  plant.  The  locations  of  non-minimum  phase  zeros  impose 
restrictions  on  achievable  performance  of  MAC.  We  must  remind  the 
reader  that  the  physical  interpretation  of  zero  in  the  impulse 
response  model  of  the  plant  is  different  from  that  of  a  transmission  zero 
in  a  rational  transfer  function  model  or  equivalently  difference  equation 
(DE)  model)  of  the  plant.  In  the  same  way  the  physical  interpretation  of 
poles  as  natural  modes  of  a  plant  are  lost  in  this  description.  However 
this  point  will  be  elaborated  further  in  the  next  section. 

(ii)  An  internal  model  of  the  plant  having  the  same  input-output 
dimension  pxm  as  that  of  the  actual  plant  and  the  pulse  response 
sequence  t&n} ,  n  =  1,2,...S.  The  input  u(n)  is  the  same  as  that  to  the 
actual  plant  and  therefore  the  output  y(n)  of  the  model  is  given  by 

y(n)  =  Hi  u(n-l)  +  &2  u(n-2)  +...+  u(n-N)  (3.2a) 

or  ¥(z)  =  R(z)  U(z)  (3.2b) 

where,  as  before 
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(3.2c) 


H(z)  =  Hp( z )  z  N 

and  Hp(z)  is  a  pxm  dimensional  polynomial  matrix.  (fin }  is  generally 
different  from  vHnJ. 

(iii)  A  p-dimensional  reference  trajectory  yr(n),  preferably 
smooth,  initialized  on  the  current  output  of  the  actual  plant  y(n)  that 
leads  y(n)  to  a  possibly  time  varying  p-dimensional  set  point  c.  If  each 
of  the  reference  trajectories  yr£(n)  has  a  first  order  dynamics  with  time 
constant  leading  to  set  point  c^  i=l,2,...p  and  if  the  trajectories  do 
not  interact  with  each  other  then  yr(n)  evolves  as 

yr(n+l)  =  Aa  yr(n)  +  (I-Aa)c,  yr(n)  =  y(n)  (3.3a) 

or,  zYr(z)  =  Aa  Yr( z)  +  (I-Aa)  C(z)  (3.3b) 

where  Aa  =  diag  (a^) 

(iv)  A  closed  loop  prediction  scheme  for  predicting  the  future 
output  of  the  plant  according  to  the  scheme 

yp(n+l)  =  y(n+l )  +  yp(n)  -  y(n)  (3.4a) 

or,  Yp(z)  =  Y(z)  +  z-1  [Y(z)  -  ?(z)]  (3.4b) 

Here  yp(n)  is  p-dimensional. 

(v)  A  quadratic  cost  functional  J  based  on  the  error  between 
yp(n)  and  yr(n)  over  a  finite  horizon  Tn  (here  Tn  is  an  integer): 

Tn 

J  =  L  [eT(n+k)  W(n+k)  e(n+k)  +  (3.5a) 

k=l 

uT(n+k-l)  R(n+k-l)  u(n+l-l)] 

Tn 

=  Trl  [W(n+k)  e( n+k)  eT(n+k)  +  (3.5b) 

k=l 

R(n+k-l)  u(n+k-l)  uT(n+k-l)j 

where  W(.)  and  R( . )  are  positive  semi  definite  time  varying  weights  and 
e(n+k)  =  yp(n+k)  -  yr(n+k).  In  most  of  MAC  applications  R( . )  is  set  to 
be  zero. 
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Given  (i)-(v),  MAC  finds  as  optimal  control  sequence  |u*(n+i-l), 
i=l,...Tn}  by  minimizing  J  over  the  admissible  input  sequence 
|u(n+i-l)eft(i) ,  i=l...Tnj.  Once  the  optimal  control  sequence  is  com¬ 
puted,  the  first  element  of  the  sequence  is  applied  to  the  actual 
plant  and  the  process  repeats  all  over  again. 

In  general,  there  is  no  analytic  solution  for  the  control 
sequence  |u*(n)}  -  it  is  computed  at  each  step  using  an  algorithm 
known  as  IDCOM.  In  its  greatest  generality,  MAC  cannot  be  put  into  a 
classical  control  framework.  However  under  the  following  simplifying 
assumptions  MAC  is  equivalent  to  an  inverse-control  law  and  can  be 
modelled  as  a  feedback  configuration. 

(i)  The  actual  plant  H(z)  is  minimum  phase; 

(ii)  The  plant  model  fi(z)  is  minimum  phase; 

(iii)  There  are  no  input  constraints,  i.e.  ft(i)  »  Rm  for  all  i; 

(iv)  Tn=l  i.e.  the  optimization  is  carried  over  one  future  step 
ahead:  under  this  condition  MAC  is  a  one-step  ahead  pre¬ 

dictive  controller. 

Under  these  simplifying  assumptions,  it  is  sufficient  to  select 
u*(n)  to  satisfy 

yp(n+l)  *  yr(n+l)  for  all  n  >  0  (3.6) 

for  a  minimum  of  the  cost  function  J.  The  assumptions  (i)-(iii) 
ensure  the  existence  of  an  optimum  control  u*(n)  that  satisifies 
(3.6)  -  the  resulting  optimal  cost  J*  is  zero  in  this  case.  However 
U*( z )  is  then  implicitly  generated  by  Yp(z)  =  Yr(z)  so  that 

U*(z)  =  [(z-l)ft(z)  +  <I-A0)h(z)]-1  [I-Aa]C(z)  (3.7a) 

Y(z)  =  H(z)  [(z-l)B(z)  +  (I-Aa)H(z)]-l  [I-Aa] C( z)  (3.7b) 

Equations  (3.7a)  and  (3.7b)  relate  the  setpoint  C(z)  with  the  optimal 
input  sequence  U*(z)  and  output  sequence  Y(z).  It  is  easy  to  see  that 
this  simplified  form  of  MAC  is  equivalent  to  the  following  M1M0  unity 
feedback  configuration  (we  have  henceforth  dropped  the  *  superscript). 
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y(z) 


Figure  1  MIMO  MAC  in  a  classical  framework 

To  see  that  the  setup  in  Figure  3.1  indeed  represents  equation  (3.7), 
note  that  at  point  1  we  have, 

U(z)  =  (z)(I-Aa)E(z) 

=  1  (z)(I-Aa)  [C( z)  -  H( z ) U( z )  ] 

Multiplying  both  sides  of  this  equation  by  (z-l)fl(z)  and  rearranging 
we  have, 

[(z-l)ft(z)  +  (I-Aa)H(z)]  U(z)  =  (I-Aa)C(z) 
from  which  (3.7a)  and  (3.7b)  follow.  The  block  within  the  dashed  line  can 
be  thought  of  as  a  dynamic  controller  of  the  classical  type.  The  loop 
transfer  function  when  the  loop  is  broken  at  the  plant  input  (point  1) 
is  given  by 

L(z)  =  -zitfl"1(z)(I-Acx)H(z)  (3‘8) 

and  determines  the  robustness  of  the  feedback  configuration  at  this 
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point.  When  we  have  perfect  identification  i.e.  H(z)  =  ft(z),  then 
points  2  and  3  are  the  same  in  Figure  3.1  and 
U(z)  =  Y(z)  =  (I-Aa)E(z) 

or,  U(z)  =  (I-Aa)  [C(z)-U(z)  ] 

or,  zU(z)  =  Aa  U(z)  *F  (I-Aa)C(z)  (3.9) 

Equation  (3.9)  is  equivalent  to 

u(n+l)  =  Aa  u(n)  +  ( I-Aa)c( n) ,  u(n)  =  y(n) 
which  shows  that  u(n)  is  the  reference  trajectory  sequence  yr(n)  as 
shown  in  equation  (3.3a).  This  means  that  when  the  plant  model  is 
known  exactly,  the  control  sequence  U(z)  is  generated  as 

U(z)  =  H_1(z)U(z)  =  H-l(z)Yr(z)  (3.10a) 

Therefore  the  output  of  the  actual  plant  is 

Y(z)  =  H(z)U(z)  -  Yr( z )  (3.10b) 

which  shows  that,  in  steady  state,  the  plant  output  y(n)  is  identical 
to  the  reference  trajectory  yr(n)  -  perfect  tracking  has  been 
achieved.  Equation  (3.10a)  clearly  shows  the  need  for  minimum  phaseness  of 
H(z).  This  analysis  has  revealed  another  interesting  property  of  MAC. 

Exact  tracking  could  as  well  be  achieved  by  inverting  the  plant  to 
generate  the  sequence  u(n)  in  an  open-loop  configuration,  but  in  MAC  it 
does  so  in  a  closed-loop  configuration  and  therefore  the  additional 
benefits  of  a  feed-back  configuration  such  as  disturbance  rejection, 
sensitivity  reduction,  etc  are  also  obtained  at  the  same  time  while 
achieving  exact  tracking. 

Further  insight  is  available  if  we  interpret  the  above  equations 
for  SISO  plants.  The  loop  variables  for  SISO  plants  are  denoted  by 
corresponding  small  letters,  e.g.  h(z)  is  a  transfer  function  for  a 
SISO  plant  and  H(z)  is  that  for  a  MIMO  plant.  Also  for  a  SISO  loop  Aa  =  a, 
and  the  Figure  3.1  takes  the  following  simple  form: 
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Figure  3.2  SISO  MAC  as  a  classical  controller 

Note  that  in  this  figure  (1-a)  can  be  treated  as  a  gain  and  the  usual 
classical  root-locus  technique  can  be  applied  to  analyze  the  behavior 
of  the  closed  loop  poles  as  a  changes  from  0  to  1 .  But  since  the 
impulse  response  description  of  a  plant  has  too  many  poles  and  zeros, 
the  root-locus  technique  will  not  be  useful  and  this  is  why  we  intend 
to  describe  MAC  in  terms  of  a  difference-equation  (DE)  model  of  the 
plant  in  the  next  section. 


3*3  Lightly  damped  system  in  terms  of  difference  equation  (DE)  and 

impulse  response  (IR)  model 

Consider  a  generic  lumped  parameter  linear  time-invariant  (LTI) 
system 


x(t)  =  Ax(t)  +  Bu( t ) ,  x(0)  =  xq  (3.11a) 

y(t)  =  Cx( t )  (3.11b) 

where  x(t),  u(t)  and  y(t)  are  n-,  ra-  and  p-dimensional  vectors  repre¬ 
senting  the  states,  inputs  and  outputs  respectively  and  A,  B,  C  have 
appropriate  dimensions.  The  corresponding  frequency  domain  descrip¬ 
tion  is 


X(s)  -  ®(s)BU(s)  and  Y(s)  =  C$(s)BU(s)  =  Hc(s)U(s)  (3.12) 

where  $(s)  =  (sI-A)~l  and  Hc(s)  is  the  impulse  response  of  the  system. 
If  Ai  =  ±  is  the  i-th  eigenvalue  of  A,  then  the  system  is 
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asymptotically  stable  if  <  0  for  each  i  and  in  this  case  each  ele¬ 
ment  of  Hc(s)  is  analytic  in  the  closed  right  half  plane.  On  the 
otherhand,  the  system  is  unstable  if  >  0.  If  Hc(t)  is  the  inverse 
Laplace  Transform  of  Hc(s),  then  for  asymptotically  stable  systems 
each  element  of  Hc(t)  approaches  zero  as  t*00,  whereas  for  an  unstable 
system  some  element  diverges.  If  the  impulse  response  Hc(t)  of  a 
system  takes  a  long  time  to  settle  down  to  zero,  the  system  is 
generally  known  as  a  lightly  damped  system.  The  damping  ratio  asso¬ 
ciated  with  the  i-th  complex  pole-pair  Xj*  =  ±  is  defined  as 


ccX2  + 


(3.13) 


so  that  0  <  <  1.  The  system  is  lightly  damped  if  £i  is  small  which 

results  when  |a^|  is  small,  i.e.  the  system  is  lightly  damped  when  at 
least  one  of  the  poles  lies  near  jw-axis.  These  systems  show  unde¬ 
sirable  behavior  of  "ringing”  and  excessive  "overshoot"  in  open-loop 
transient  response.  The  impulse  response  of  these  systems  decays  to 
zero  very  slowly,  and  therefore  a  large  amount  of  data  must  be  stored 
in  the  computer  for  representing  the  impulse  response  sequence  model 
of  the  plant  which  directly  affects  MAC  computation. 

Since  MAC  is  a  digitally  implemented  control  algorithm,  we  must 
find  a  sampled-data  version  of  (3.11).  There  are  several  ways  of 
implementing  digital  control  schemes  -  one  of  these  is  the  sample  and 
zero-order  hold  mechanism  which  is  equivalent  to  discretizing  (3.11) 
by  using  an  exponential  transform.  In  this  method  the  input  is 
sampled  every  T  seconds  and  held  constant,  i.e.  u(t)  =  u(n)  ,  nT  <  At  < 
(n+1 )T  between  the  two  sampling  instant.  In  this  case  the  z-domain 
and  s-domain  descriptions  are  related  through 

z  =  esT  (3.14) 

and  the  corresponding  discrete-time  system  in  state-space  description 
is 


x(n+l)  =  Fx(n)  +  Gu(n) ,  x(0)  =  xq 


(3.15a) 


y(n)  =  Cx(n)  (3.15b) 

where  F  =  exp  (AT),  G  =  (F-I)A“^B,  provided  that  A“^  exists,  otherwise 


T 

G  =  /  exp  (Aw)dwB  (3.15c) 

0 

If  the  system  (3.15)  is  asymptotically  stable,  the  zero-state  solu¬ 
tion  of  (3.15)  is  given  by 

n-1 

y(n)  =  l  Hn_i  U( i) ,  Hn  =  CF^-l  G,  n  >  1  (3.16a) 

i=0 

which  is  the  familiar  discrete-time  convolution.  Notice  that  if  T  is 
very  small,  to  the  extent  that  max  | A-^ j T |  «  1,  where  Aij  is  the 

ij 

(i,j)-th  element  of  A,  then 

Hn  -  C  exp( A(n-1)T)BT  (3.16b) 

which  also  results  if  the  integral  in  (3.15c)  is  approximated  by  the  lower 
Riemann  sum.  Taking  the  z-transform  of  (3.15a)  -  (3.15b)  we  get  the 
frequency  domain  description, 

Y(z)  -  Hd(z)U(z),  (3.17a) 

where 

Hd(z)  =  C(zI-F)_1G  (3.17b) 

The  power  series  expansion  of  H(z)  gives 

Hd(z)  =  C( I/z  +  F/z2  +...)  G  =  l  Hnz-n  (3.18a) 

n 

with  the  region  of  convergence  (ROC)  |z|  >  max  |Xj_(F)|.  We  can 

i 

recover  { Hn }  from  Hd(z)  using  a  Cauchy  Integral  as  follows 

Hn  =  1  Hd(z)zn_1  dz  =  cFn_1  G  (3.18b) 

2lTJ 

which  is  the  same  in  (3.16a). 

Ideally  an  IR  sequence  { Hn }  computed  in  the  above  manner  has  an 
infinite  number  of  terms.  Since  MAC  uses  in  its  internal  algorithm  a 
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finite  impulse  response  sequence  { Hn } ,  the  matrix  valued  sequence  { Hn } 
must  be  a  fast  converging  one.  The  poles  in  the  continuous  time 
system  and  those  of  the  sampled-data  system  are  related  by  z ^  = 
exp  (A^).  Therefore  the  discrete  time  system  is  unstable  if  |zj_|  >  1 

for  any  i  and  is  a  lightly  damped  system  if  |zjj<l  but  close  to  unit 

circle  i.e.  |z-j_|-l.  In  the  earlier  case  {Hn}  diverges  and  in  the 
later  case  {Hn}  has  a  very  large  number  of  terms  before  it  converges 
to  zero.  If  the  system  is  asymptotically  stable  {Hn}  converges,  and 
given  e  >  0  we  can  always  find  an  integer  N(e)  such  that  11  Hn  II  <  e 

for  all  n  >  N  and  we  can  truncate  the  impulse  response  sequence  to  any 

desired  degree  of  accuracy.  The  finite  impulse  response  description 
is  also  known  to  practicing  engineers  as  a  moving  average  (MA)  or  all 
zero  model  of  the  plant. 

Now  suppose  that  an  impulse  response  has  been  truncated  to  obtain 
a  finite  sequence  {Hnj  -  { H| , H2 • • • } .  MAC  uses  this  description  of 
the  plant  model  as  shown  in  section  3.2  for  a  lightly  damped  system. 
This  sequence  is  relatively  long.  The  z-transform  H(z)  is  given  by 

N 

H(z)  =  l  Hn  z“n-  (3.19) 

n=l 

Comparing  with  (3.18a)  we  find  that 

Hd(z)  -  H( z ) ,  |z|  »  1.0.  (3.20) 

Here  H^(z)  will  be  called  a  difference  equation  (DE)  description  and 
H(z)  an  impulse  response  description.  Although  Hd(z)  and  H(z)  are 
approximately  equal  for  all  z  within  the  region  of  convergence,  the 
physical  interpretation  associated  with  the  two  description  are 
different.  To  see  the  difference  clearly,  consider  a  SISO  plant  in 
which  case  Hd(z)  and  H(z)  are  complex  scalars  and  represented  respec¬ 
tively  by  HfjCz)  and  h(z).  Then 


hd(z) 


b(z) 

a(z) 


where  a(z)  and  b(z)  are  polynomials  in  z,  b(z)  having  a  lower  degree 
than  a(z)  for  a  causal  system.  The  zeros  of  the  denominator  a(z)  are 
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the  'poles'  of  the  system  hd(z)  and  are  associated  with  the  natural 
modes  of  the  system.  The  impulse  response  (IR)  of  the  system  is  com¬ 
posed  of  these  modes.  The  zeros  of  b(z)  are  transmission  zeros  of  the 
plant  which  have  the  physical  interpretation  that  if  zd  is  a  zero  of 
the  plant  and  if  zj_  is  also  a  mode  of  the  input  to  the  plant,  then 
this  mode  of  the  input  is  blocked  by  the  plant  and  does  not  appear  at 
the  output.  On  the  otherhand  the  IR  description  h(z)  can  also  be 
written  as 


h(z) 


n(z) 

d(z) 


where  n(z)  and  d(z)  are  polynomials  in  z.  Here  d(z)  =  z^,  and  n(z)  is 
a  polynomial  of  degree  N.  This  shows  that  h(z)  has  N  poles  at  the 
origin  and  N  zeros  -  but  these  poles  and  zeros  do  not  have  any  physi¬ 
cal  significance  as  in  the  rational  transfer  function  model  hd(z). 


Since  we  want  to  explain  the  behavior  of  MAC  in  terms  of  standard 
pole-zero  configuration,  our  immediate  objective  is  to  describe  MAC 
using  a  difference  equation  model. 


3.4  MAC  with  Difference-Equation  Model:  a  Root  Locus  Approach 

Consider  again  a  pxm  dimensional  MIMO  plant  Hd(z)  with  input  U(z) 
and  output  Y(z).  Then  parallel  to  the  description  of  MAC  in  section 
3.2,  we  can  describe  the  various  elements  of  MAC  as  follows: 

(i)  The  actual  plant  described  by 

Y(z)  =  Hd(z)  U(z)  (3.21) 

(ii)  The  internal  model  of  the  plant,  also  described  by  a 
rational  transfer  function  description  and  given  by 

Y(z)  =  Hd(z)  U(z)  (3.22) 

(iii)  A  p-dimensional  reference  trajectory  yr(n)  which  evolves  aS 
yr(n+l)  =  Aa  yr(n)  +  (I-Aa)c,  yr(n)  =  y(n) 
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or, 

(iv) 


(3.23) 


zYr(z)  =  Aa  Y(z)  +  (I-Aa)C(z) 
a  closed-loop  prediction  scheme  yp(n)  for  predicting  the 
future  output  of  the  plant,  according  to  the  scheme 
yp(n+l)  =  y( n+1 )  +  yp(n)  -  y(n) 

Yp( z )  =  Y(z)  +  z-1  [Y( z)  -  Y( z) ]  (3.24) 

(v)  and  a  cost  functional  as  in  (3.5) 

If  we  compare  the  expressions  in  (  3 . 2  1 ) -(  3 . 24 )  with  those  in 
(32. lb)-(3.4b) ,  we  see  that  these  expressions  are  the  same  mathemati¬ 
cally  although  in  ( 3 . 1 b)-( 3 . 4b)  we  have  used  the  IR  description  of  the 
plant  whereas  in  (3.21)— (3.24)  we  have  used  the  DE  (rational  transfer 
function)  model  of  the  plant.  This  comparison  reveals  the  important 
fact  that  the  basic  principle  of  MAC  does  not  depend  upon  the  model 
description  of  the  plant  i.e.,  whether  the  model  is  described  using  a 
difference  equation  or  impulse  response.  Therefore,  for  a  one-step 
ahead  prediction  horizon,  the  interpretation  of  MAC  as  a  feedback  con¬ 
figuration  (as  shown  in  Figure  3.1)  is  also  applicable  in  this  case. 
The  important  difference  in  this  case  is  that  if  we  use  the  DE  model 
of  the  plant,  we  can  associate  the  traditional  pole-zero  interpreta¬ 
tion  to  MAC.  Indeed  if  we  choose  aj_  =  a  making  the  dynamics  of  all 
the  reference  trajectories  the  same,  then  we  have  Aa  =  al  and  the 
Figure  3.1  then  is  a  familiar  unit  feedback  MIMO  configuration  with 
(1-a)  playing  the  role  of  a  varying  gain.  There  are  two  advantages  of 
this  conf iguration : 

(i)  the  closed  loop  pole  position  can  be  ascertained  apriori 
using  the  multivariable  root-locus  approach; 

(ii)  robustness  of  the  closed  loop  can  be  examined  in  terms  of 

the  recently  developed  criteria  employing  the  loop  transfer 
function  and  return  difference  function  at  appropriate 
points  in  the  loop. 
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3.4.1  Root  Locus  Analysis  of  MAC 

Consider,  for  simplicity,  a  SISO  plant  with  an  actual  transfer 
function  h(z)  and  suppose  that  its  model  is  given  by  Fi(z).  We  assume 
that  both  the  plant  and  this  internal  model  is  described  by  difference 
equation.  Note  that  we  have  dropped  the  subscript  d  here  from  the 
previous  section  for  notational  convenience. 

It  is  not  obvious  how  y(n)  will  be  affected  as  a  changes,  but  the 
effect  can  be  analyzed  as  if  we  are  finding  the  root  locus  of  the 
closed-loop  configuration  in  Figure  3.2.  We  can  consider  both  a  and 
h(z)  as  parameters  that  can  be  varied  to  regulate  the  closed-loop 
behavior  of  the  system.  Indeed  if, 


n(z)  =  h(z) 


(3.25) 


d(z)  (z-l)Ft(z) 


wher  n(z),  d(z)  are  polynomials  in  z, 


h(z)  =  plant  transfer  function  in  DE  description 
fi(z)  =  model  of  the  plant  in  DE  description 


the  closed  loop  poles  will  trace  a  continuous  path  from  the  open-loop 
poles  (i.e.  poles  of  the  plant,  the  zeros  of  the  model  and  the  zero  at 
z=l)  to  the  open-loop  zeros  (i.e.  poles  of  fi(z)  and  zeros  of  h(z))  as 
the  gain  varies  from  0  to  infinity.  But  here  the  gain  (1-a)  varies  from  0 
to  1  as  a  varies  form  0  to  1.  So  the  closed  loop  poles  trace  a  path 
from  the  open  loop  poles  to  somewhere  towards  the  open  loop  zeros.  To 
put  the  problem  into  a  standard  framework  of  root  locus,  we  introduce 
a  one-to-one  invertible  mapping: 


(3.26) 


so  that  as  a  changes  from  0  to  1 ,  B  changes  from  0  to  infinity. 


Let 


From  Figure  3.2  it  can  be  shown  that  the 


input-output  of  the  closed- 


loop  is  given  by 
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hcl(z)  =  O.  y(z)  =  hc^(z)  c(z) 

L  d+( l-a)n 


where  for  simplicity  we  have  written 

n  =  n^(z)  dh(z)>  and  d  =  (z-1)  nh(z)  d^(z). 
For  convenience  henceforth  we  shall  suppress  the  argument  z. 


the  transformation  a  = 


hci(z)  = 


1+3 


gives 


deq  +  3neq 


where 


^eq  =  nh^h  +  (z-On^d^ 
n^q  ~  d  —  ( z-1 ) n^d^ 

The  closed  loop  characteristic  polynomial  is 
^  cl ( z )  =  deq(z)  +  3neq(z) 

It  is  obvious  that 


(3.27) 

Using 

(3.28) 


(3.28a) 


(i)  as  3+0  i.e.  a+0  (fast  reference  trajectory),  the  closed 

loop  poles  approach  the  zeros  of  deq(z)  =  n^3^  +  (z-l)n^d^# 

Depending  on  the  characteristics  of  this  polynomial  the 
closed  loop  response  may  be  oscillatory,  damped  and/or 
unstable. 


(ii)  as  3+  infinity  i.e.  a+1  (slow  reference  trajectory),  the 
closed  loop  poles  approach  the  zeros  of  neq(z),  i.e.  one 
pole  approaches  +1  and  the  remaining  poles  approach  the 
poles  of  the  plant  and  the  transmission  zeros  of  the  model. 
The  pole  at  z=l  will  contribute  to  the  sluggish  response  of 
the  closed  loop  system. 


So  the  problem  of  obtaining  a  specific  response  from  MAC  can  be 
translated  into  the  design  of  the  polynomials  neq(z)  and  deq(z).  If 
the  open  loop  poles  are  not  located  in  the  appropriate  region  of  the 
z-plane,  we  can  choose  the  model  of  the  plant,  i.e.  n^  and  d^  such 
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that  the  zeros  of  the  polynomial  deq(z)  are  placed  accordingly  and  the 
specific  response  can  be  obtained  asymptotically  as  &+0.  Note  that 
the  stability  of  the  plant  or  the  model  is  not  required  when  analyzing 
MAC  in  a  root  locus  framework.  The  problem  is  algebraic  in  nature, 
i.e.,  is  a  problem  of  synthesizing  a  specific  polynomial  deq(z). 

3.4.2  Examples 

In  this  section  we  will  demonstrate  the  above  analysis  through  a 
simple  example. 

Example  3.1 

Consider  a  scalar  dynamic  system 

x(t)  =  ax(t)  +  bu(t),  x(0)  =  xq 
y(t)  =  cx(t). 


(3.29a) 

(3.29b) 


Suppose  the  input  and  ouput  are  sampled  every  T  seconds.  Then  the 
corresponding  discrete-time  (scalar  dynamic  system,  as  obtained  by 
using  the  exponential  transform  (3.14),  is 

x(n+l)  =  f x(n)  +  gu(n)  (3.30a) 
y(n)  =  cx(n) ,  (3.30b) 


where 


f 


the  plant  is 


eaT  and  g  =  ea^~  1  • 
a 


Now  suppose  that  the  model  of 


x(n+l)  =  ?x(n)  +  gu(n)  (3.31a) 

y(n)  =  cx(n)  (3.31b) 


For  simplicity,  let  us  chose  c=l/g.  Then  using  the  notation  of  the 
previous  section  we  have 


h(  z) 


1 

z-f  = 


nh(z) 

dhU>  ’ 


rih(z) 
cfh(zT  * 
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Therefore,  using  the  notation  in  (3.28), 

deq  =  nhdh  +  (z-l)nhdft  =  z-f  +  (z-l)(z-f), 
neq  =  (z“^-)dhn^  =  (z-l)(z-f), 

and  the  closed  loop  characteristics  polynomial  is 
4>cl(z)  deq(  z )  4*  8ngq(z) 

=  (z-f)  +  (z-l)(z-f)  +  8(  z-1 )  ( z-f ) . 

As  (3+0  (i.e.  a+0 :  a  fast  trajectory),  the  closed  loop  characteristic 
polynominal  asymptotically  approaches 

<t>cl(z)+  Z2-fz+(f-£) 
and  the  closed  loop  poles  approach 

z i >2  -  f±  /f2-4(f-f).  (3.32) 

Suppose  f=0#9  and  the  model  is  perfect,  i.e,  £=0-9  too.  Then 
since  the  plant  is  minimum  phase,  the  closed-loop  transfer  function 
for  all  values  of  8  is,  from  (3.27), 

hcl(z)  =  (3.33) 

z-a 

i.e.,  the  perfect  tracking  has  been  achieved.  This  is  shown  in  Figure 
3.3,  for  a=0*01. 

When  £  =  0*1,  and  equation  (3.32)  indicates  that  the  closed  loop 
poles  approach  0*45±j0*77.  The  closed  loop  response  therefore  is 
oscillatory  which  is  demonstrated  in  Figure  3.4  for  a=0*01.  If 
f=-0*l,  the  closed  loop  poles  approach  O*45±jO09O  -  the  closed  loop 
response  becomes  further  oscillatory,  which  is  shown  in  Figure  3.5  for 
the  same  value  of  a.  Similary  a  choice  of  £=-0.4  places  the  closed 
loop  poles  at  0*45±j 1*047  and  the  simulation  has  indeed  shown  the 
instability. 
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I NPUT ( DEGREE )  '  OUTPUT ( DEGREE ) 

-20.00  -10.00  0.00  10.00  -5.00  5.00  15.00  25.00 


INPUT  FILE: 

TfiU: 

.01 

CHE1GHT1NG: 

0.0 

0HEIGHT1NG: 

1.0 

1  BLOCKING: 

1 

I - 

0.00 


1.00  2.00  3.00 

T I  ME ( SEC  ) 


4.00  5.00 


6.00 


0.00 


1  .00 


2.00  3-00 

T I  ME ( SEC  ) 


4.00 


5.00 


6.00 


Figure  3.3  Perfect  tracking  when  the  model  is  exact. 
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It  may  also  be  noted  that  as  3+°°,  one  of  the  closed  loop  poles 
approaches  +1.0  which  implies  the  loss  of  asymptotic  stability  or  a 
very  sluggish  response  therefore  highly  undesirable  to  operate  MAC 
with  a  large  3  (or,  a  nearly  equal  to  1).  But  the  problem  with 
smaller  3  is  that,  along  with  a  fast  response,  the  bandwidth  of  the 
closed  loop  system  is  increased  and  the  possibility  of  excitation  of 
the  unmodelled  dynamics  is  also  increased.  A  compromise,  therefore, 
is  needed  while  choosing  the  value  of  3. 

To  see  how  the  root-locus  interpretation  helps  in  determining  MAC 
behavior,  let  us  consider  the  case  of  a  perfect  model  and  assume  that 
the  system  is  minimum  phase.  Then  from  equation  (3.2), 

n(z)  =  1 

d(z)  z-1 

and 

nh  *  nh>  ^h  =  seq  =  zn^d^,  neq  =  (z-ljn^ 

c()u(z)  -  znhdh  +  3(z-l)  nhdh 

Clearly  then  as  3+0,  (or  a+  0),  one  closed  loop  pole  approaches  the 
origin  z=0  and  the  others  approach  the  zeros  of  n^d^.  The  later 
poles,  however,  get  cancelled  eventually  (indicating  that  these  mode 
become  asymptotically  either  unobservable  or  uncontrollable)  and  the 
pole  at  z=0  becomes  dominant,  and  a  fast  response  is  available  from 
MAC.  On  the  other  hand  as  3+°°  (or,  a+1),  the  dominant  pole  is  the  one 
as  z=l  and  a  sluggish  response  is  obtained.  All  of  these  analyses 
agree  with  the  observed  behavior  of  MAC  in  everyday  use. 

3.5  Apriori  fixed  Gain  Compensation  of  a  Lightly  Damped  System 

or  Unstable  System. 

A  lightly  damped  system  has  a  long  impulse  response  (IR)  sequence 
and  therefore  Imposes  burden  on  the  computer  storage.  If  the  impulse 
response  is  sampled  according  to  Nyquist  sampling  rate,  an  impulse 
response  sequence  of  60-150  elements  are  very  common  for  a  lightly 
damped  system,  particularly  if  the  system  has  a  frequency  mode.  It 
has  been  proposed  that  some  additional  damping  may  be  introduced  into 
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the  system  by  applying  output  feedback  and  then  MAC  be  applied  to  the 
overall  system.  It  is  the  purpose  of  this  section  to  investigate  if 
apriori  fixed  gain  output  feedback  can  be  useful  for  MAC  application. 
Since  there  is  no  mathematical  model  available  for  a  standard  regular 
MAC  with  multistep  prediction  horizon,  input  blocking  etc,  we  can  not 
investigate  analytically  the  effect  of  apriori  output  feedback  on  MAC. 
So  the  following  analysis  is  based  on  the  available  properties  of  out¬ 
put  feedback  and  our  analysis  is  more  qualitative  then  quantitative. 

We  shall  primarily  emphasize  on  the  issue  that  whether  we  can  make  the 
length  of  impulse  response  shorter  using  apriori  fixed  gain  analog 
compensation  of  the  plant. 

3.5.1  Qualitative  Analysis 

In  Section  3.4,  we  have  characterised  a  lightly  damped  system  by 
its  pole  positions.  Roughly  speaking,  a  continuous  time  dynamic 
system  is  lightly  damped  if  any  of  its  poles  lies  near  the  jto-axis  in 
the  s-plane.  Similarly,  in  discrete  time  domain,  a  system  is  lightly 
damped  if  any  pole  lies  near  the  unit  circle  on  the  z-plane. 

Physically  it  means  tha  the  impulse  response  (IR)  or  the  IR  sequence 
is  relatively  longer.  This  fact  plays  an  important  role  in  the  MAC 
technique,  because  the  latter  uses  the  IR  descritpion  of  the  plant.  A 
lightly  damped  system  has  a  relatively  longer  IR  sequence  and  there¬ 
fore  uses  more  computer  storage  compared  with  a  damped  system.  Since 
an  unstable  system  has  an  infinitely  long  IR  sequence,  the  current  MAC 
implementation  using  the  IR  can  not  handle  such  systems. 

If  a  system  is  open-loop  unstable  or  lightly  damped,  it  can  be 
made  stable  or  damping  can  be  added  apriori  using  constant  or  dynamic 
output  feedback.  The  compensated  plant  with  possibly  a  shorter  IR 
sequence  can  be  thought  of  as  a  new  plant  and  MAC  can  then  be  applied 
to  it  for  improved  performance  -  the  overall  configuration  is  hybrid 
in  nature.  For  simplicity,  consider  again  a  SISO  plant 

x ( t )  =  Ax( t )  +  bu( t ) ,  x(0)=xq  (3.34a) 
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Figure  3.4  Oscillatory  behavior  for  a  fast  trajectory. 
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INPUT (DEGREE)  OUTPUT ( DEGREE ) 

-20.00  -10.00  0.00  10.00  -5.00  5.00  15-00  25.00 


0.00  1.00 


H - 1 - 1 - 1 - 1 

2.00  3.00  4.00  5-00  6.00 

T I  ME ( SEC  ) 


Figure  3.5  Oscillatory  behavior  of  MAC  as  predicted  by  analysis. 
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y(t)  =  cx(t) , 


(3.34b) 


where  x(t)  is  n-dimensional  and  A,b,c  have  appropriate  dimensions.  If 
an  output  feedback  control  law  is  chosen  of  the  form 

u(t)  =  -ky(t)  +  v(t),  (3.35) 

the  closed  loop  system  is  given  by 

x ( t )  =  (A-bkc)x(t)  +  bv(t),  x(0)=xq  (3.36a) 

y( t)  =  cx( t ) ,  (3.36b) 

and  the  closed-loop  poles  are  given  by  the  eignevalues  of  A-bkc.  The 
hybrid  system  as  a  result  of  application  of  MAC  is  shown  in  Figure 

3.6. 


The  speed  of  response  and  bandwidth  of  the  system  can  be  increased 
using  output  feedback.  This  makes  it  necessary  to  use  a  higher 
sampling  rate  for  the  compensated  plant.  This  point  needs  some  clari¬ 
fication.  Although  the  Nyquist  criteria  holds  for  bandlimited 
signals,  engineers  have  selected  sampling  rates  according  to  this  cri¬ 
teria,  whether  the  signal  is  bandlimited  or  not,  i.e.,  a  rate  of  at 
least  twice  the  highest  frequency  in  the  oscillatory  modes  in  a  plant. 
Similarly  in  a  system  without  any  oscillatory  modes,  the  sampling  rate 
is  selected  at  a  rate  determined  by  the  ’’Bandwidth  (BW)”  of  the 
system.  We  may  recall  that  the  BW  of  such  systems  are  defined  as  the 
frequency  where  the  magnitude  of  the  loop-transfer  function  drops  off 
to  half  of  its  dc  value.  In  this  section  we  will  see  how  apriori  out¬ 
put  feedback  affects  MAC  performance  via  the  sampling  rate  selection. 
The  effect  on  robustness  will  be  discussed  in  the  next  chapter. 

Case  1.  When  the  Plant  is  Open-Loop  Unstable: 

If  the  states  are  available  for  feedback,  then  it  is  well  known 
that  under  the  assumption  of  controllability,  the  closed  loop  poles 
can  be  placed  arbitrarily  in  the  complex  plane  using  constant-gain 
state-feedback.  But  in  the  case  of  constant-gain  output  feedback. 
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yr(n) 
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Figure  3.6  MAC  applied  to  an  apriori  compensated  plant 
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this  freedom  is  lost  and  the  poles  are  moved  accordingly  to  the  rules 
for  root  locus.  But,  as  we  know,  it  may  not  be  possible  to  stabilize 
an  unstable  plant  by  constant-gain  output-feedback  -  the  interested 
reader  may  consult  Youlafs  elegant  work  [Youla,  et.  al ,  1974]  for 
details.  In  such  cases,  the  plant  must  be  stabilized  first  by  using 
dynamic  output  feedback  before  MAC  can  be  used  on  the  overall  plant 
hf(s)  in  Figure  3.6.  However  once  a  stable  hf(s)  is  obtained  MAC 
treats  it  like  any  other  stable  plant. 

Case  2.  When  the  Plant  is  Lightly  Damped: 

If  the  open-loop  plant  has  all  the  transmission  zeros  in  the  open 
left  half  of  the  s-plane  (OLHP) ,  the  gain  k  can  be  made  high  and 
arbitrary  fast  response  can  be  obtained  without  destabilizing  the 
overall  plant  hf(s).  As  k+°°,  some  of  the  closed  loop  poles  approach 
the  finite  transmission  zeros  of  the  plant  and  the  remaining  ones 
approach  infinity.  The  limiting  dynamical  behavior  of  hf(s)  is  deter¬ 
mined  by  the  location  of  the  transmission  zeros.  If  the  system  has 
closed  right  half  plane  (CRHP)  (in  the  s-plane)  zeros,  k  can  not  be 
increased  arbitrarily. 

The  BW  of  the  overall  system  hf(s)  in  Figure  3.6  is  determined  by 
the  fastest  dynamics  which  in  turn  are  determined  by  the  poles  that 
move  toward  infinity.  Therefore  as  k+°°,  the  plant  output  must  be 
sampled  faster  and  faster  to  capture  the  dynamical  characteristics  of 
the  overall  plant  hf(s).  The  situation  is  even  worse  if  the 
transmission  zeros  are  stable  and  lie  near  the  ju>-axis.  In  this  case, 
as  k->°°,  some  of  the  closed-loop  poles  arrive  at  these  zeros  and  there¬ 
fore  hf ( s )  is  lightly  damped  again.  The  IR  of  this  system  is  composed 
of  slow  dynamics  as  well  as  of  fast  dynamics  -  the  modes  corresponding 
to  slow  dynamics  make  the  impulse  response  of  hf(s)  long  and  the  modes 
corresponding  to  fast  dynamics  dictate  a  fast  sampling  rate.  The  net 
result  is  that  the  IR  sequence  of  the  discretized  system  has  possibly 
many  more  terms  then  the  uncompensated  plant.  Therefore,  there  is  a 
trade-off  between  the  damping  added  to  the  system  using  output  feed¬ 
back  and  the  resulting  sampling  rate. 
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Although  the  length  of  the  impulse  response  gets  smaller  as  a 
result  of  damping  added,  we  may  keep  the  sampling  rate  unchanged  so 
that  the  number  of  terms  in  the  IR  sequence  is  smaller.  This 
obviously  deteriorates  MAC  performance.  We  illustrate  these  ideas 
with  two  simple  examples. 

3.5.2  Examples 

Example  1. 


Consider  again  the  scalar  system  of  the  last  section; 


x(t)  =  -ax( t )  +  bu( t ) ,  x(0)=xq  (3.37a) 

y(t)  =  cx( t ) ,  (3.37b) 

where  a,b,c  are  scalars.  Let  us  assume  c-l,  then  the  open-loop 
transfer  function  of  the  plant  is  hc( s)=b/(s+a) .  Although  there  is  no 
oscillatory  mode  in  this  system,  we  will  call  it  a  lightly  damped 
system  if  a-0.  Using  an  output  feedback  control  law  u=-ky+v ,  the 
closed  loop  system  is 


x(t)  =  -(a  +  bk)x(t)  +  bv( t ) ,  x(0)=xq  (3.38a) 

y ( t )  «  x(t)  (3.38b) 


and  the  closed-loop  transfer  function  hf(s)=b/(s  +  a  +  bk).  The  power 
spectrum  is 


|hf(ju>)  |2  =  - b - 

u)2  -i-  (a+k)2 


(3.39) 


Clearly  if  the  BW  wq  of  this  system  is  defined  as  the  frequency  0)q  at 
which  |hf(jw0)  |  =  p  |hf(jO)|,  where  p  is  a  constant,  then  u)q  is  given 
by 


0)Q 


/o/P2-n 


(a+k)  =  2irfQ. 


(3.40a) 
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The  sampling  time  interval  T  is  given  by 


T 


1 

2f0 


IT 


/ 


(l/p2-l)  (a+k) 


(3.40b) 


The  last  equation  shows  that  as  k  is  increased  to  add  more  damping, 
(or,  strictly  speaking,  to  get  a  shorter  duration  IR  sequence)  the  BW 
Wq  is  also  increased  and  so  does  the  sampling  rate.  The  discretized 
system  corresponding  to  (3.38)  is  obtained  via  an  exponential  trans¬ 
form  as 


x(n+l)  =  fx(n)  +  gu(n)  (3.41a) 

y(n)  =  x(n)  (3.41b) 

where 

f  =  e-(a+k)T  ancj  g  =  f-1  b  (3.41c) 

We  shall  examine  how  the  MAC  performance  varies  for  a  given  T  as  k 
changes.  Suppose  a=l  and  b=10  and  consider  the  case  for  k=0;  then  for 
a  choice  of  T=0.1  Sec,  f=0. 90484  and  g=0. 95163.  MAC  is  applied  to 
this  system  with  a  set  point  of  15.0,  a=0*l.  The  result  is  shown  in 
Figuree  3.7.  Next  k=10  is  selected.  For  the  same  value  of  T,  the 
discretized  system  parameters  are  f=0*332871  and  g=0*60648.  The 
result  of  applying  MAC  to  this  system  is  shown  in  Figure  3.8.  Notice 
the  difference  between  the  control  efforts  in  the  two  cases.  In  the 
later  case,  the  same  sampling  interval  of  T=0*1  secs  has  captured  less 
dynamical  characteristics  than  the  earlier  case  and  the  controller  has 
spent  more  control  effort  in  the  stady-state  tracking. 

Example  2. 


Next  consider  the  decoupled  longitudinal  dynamics  of  an  air-to-air 
missile  (cf.  AFWAL-TR-80-3125)  [1]. 

o\  u(t) 


x(t)  = 


-1.4868  1.00 

-149.93  0 


x(t)  + 


-281.1 1 
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Figure  3.7  MAC  applied  to  a  lightly  damped  system,  T=  0« 1  secs. 
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Figure  3.8  MAC  applied  to  a  system  after  damping  is  added  to  a  lightly 
damped  system,  for  the  same  sampling  interval 
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Figure  3.9  MAC  applied  to  longitudinal  dynamics  of  air-to-air 
missile,  T  =  0«1  secs. 


6.00 


6.00 


3-?0 


y(t)  -  (l  o)x(t) 


where 


xj(t)  =  angle  of  attack  in  radian 
*2(t)  =  perturbed  pitch  rate  (rad/sec) 
u(t)  =  elevator  angle  (rad) 


The  eigenvalues  are  at  92  =  “0 . 7434±j 12 . 22 


The  damping  ratio  £  - 


1.4868 


-  0.061 


/|l.48682  -  4  149.93 1 


The  system  is  lightly  damped  with  a  natural  frequency  of  1.95  Hz. 
Therefore,  the  output  must  be  sampled  at  least  every  1/4  sec.  Using 
negative  feedback  of  the  output,  the  closed  loop  system  is  given  by, 


x(t) 


y(t)  =  (1  0)  x  (t) 


The  damping  ratio  of  the  closed  loop  system  can  be  found  as 


1.4868 


/ 


(1.4868)2  -  4(149.93  +  281.11k)  | 


Clearly  for  k  >  0,  £0  *  As  k  increases,  the  system  approaches 

being  undamped  and  accordingly  the  sampling  rate  decreases  up  to  about 
k=0*531  when  the  system  becomes  critically  damped.  As  k  increases 
further  both  poles  are  real  -  one  mode  becomes  fast  and  the  other  mode 
slow  thus  making  the  IR  even  longer  until  k=0*533  when  the  system  is 
marginally  stable. 

As  in  the  last  example,  T=0.1,  set  point  =  15.0  and  a=0.1  is 
selected.  The  result  of  application  of  MAC  to  the  uncompensated 
plant,  i.e.  k=0 ,  is  shown  in  Figure  3.9.  Now  when  k=0- 53289,  the  com¬ 
pensated  system  is  sampled  in  sampled  at  T=0.1  secs,  and  MAC  is 
applied  to  the  discretized  system  at  this  sample  rate.  The  result  is 
shown  in  Figure  3.10.  As  in  the  last  example,  the  control  effort 
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MAC  applied  to  air-to-air  missile:  after  using  apriori 
output  feedback,  but  at  the  same  sampling  interval  of Q  i 
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needed  to  keep  it  in  the  right  trajectory  is  larger  than  for  uncompen¬ 
sated  plant. 

3.6  Conclusion 


The  main  contribution  of  this  chapter  is  the  description  of  MAC  for 
Multivariable  system  in  section  3.2  where  it  has  been  shown  that  the 
classical-controller  interpretation  of  MAC  can  be  extended  to  M1M0 
systems.  This  interpretation  of  MAC  will  help  the  designer  to  apply 
the  recently  developed  robustness  analysis  tool  to  MIMO  MAC.  Another 
important  contribution  of  this  chapter  is  the  description  of  MAC  using 
the  rational  transfer  function  (or  difference  equation)  model  of  the 
plant  -  MAC  can  then  be  interpreted  in  a  root-locus  framework  and 
explained  using  traditional  pole  and  zeros  of  a  rational  transfer 
function.  Finally  in  section  3.5,  the  effect  of  apriori  analog  com¬ 
pensation  on  the  MAC  performance  has  been  investigated  qualitatively. 
It  has  been  found  that  if  the  addition  of  output  feedback  creates  a 
faster  mode  than  in  the  uncompensated  plant,  the  sampling  rate  must  be 
increased  accordingly  to  capture  the  dynamical  characteristics  of  the 
compensated  plant.  Otherwise  MAC  performance  will  deteriorate. 
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CHAPTER  4 


ROBUSTNESS  ANALYSIS  OF  MAC 


4. 1  Introduction 

Any  model  of  the  plant  is  almost  invariably  different  form  the  actual 
plant  for  many  reasons.  For  the  purpose  of  synthesizing  a  finite  dimen¬ 
sional  controller,  the  plant  is  modelled  as  finite  dimensional  even  though 
the  plant  may  be  of  a  distributed  nature  or  may  have  delays  embedded  in  it. 
Usually  the  high  frequency  part  of  a  plant  is  neglected  and  the  model 
emphasizes  the  low  frequency  behavior  of  the  plant.  Even  though  a  plant 
has  been  modelled  accurately  in  the  past,  low  frequency  error  is  introduced 
eventually  due  to  aging,  deterioration  etc.  On  the  other  hand  a  control 
law  is  designed  on  a  nominal  model  and  implemented  on  the  actual  plant. 

The  nominal  control  law  therefore  must  be  robust  enough  to  ensure  the  per¬ 
formance  level  for  the  actual  plant.  The  purpose  of  robustness  analysis  is 
to  examine  the  range  of  the  nominal  control  law  maintaining  the  closed-loop 
stability  and  performance  level  for  all  the  plants  around  the  nominal 
model.  The  classical  designers  measure  the  robustness  (with  respect  to 
stability)  of  a  nominal  control  law  by  its  gain-margin  (GM)  and  phase- 
margin  (PM).  In  this  chapter,  the  robustness  of  the  MAC  control  law  will 
be  studied  from  the  viewpoint  of  a  classical  controller  and  therefore . MAC 
must  be  modelled  as  a  classical  controller.  We  have  already  developed  a 
model  of  MAC  of  this  type  in  the  preceeding  chapters  which  we  summarize 
here  again  briefly.  For  simplicity  of  analysis,  we  shall  consider  SISO 
plants  only.  The  MIMO  plants  are  described  in  Larimore,  Mahmood ,  and  Mehra 
(1984). 

This  chapter  is  organized  as  follows.  The  MAC  model  developed  in  the 
previous  chapter  is  briefly  reviewed  in  Section  4.2  —  this  model  is  the 
basis  for  all  subsequent  analysis  of  robustness.  Classical  gain  margin 
(GM)  and  phase  margin  (PM)  for  MAC  are  analyzed  in  Section  4.2.1.  The 
robustness  in  terms  of  GM  and  PM  can  handle  a  limited  class  of  plant  per- 
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turbations;  therefore  a  more  generalized  class  of  perturbations  are  charac¬ 
terized  and  robustness  evaluated  in  Section  4.2.2.  Since  a  rational 
transfer  function  model  usually  has  far  fewer  numbers  of  parameters  than  in 
the  impluse  response  (1R)  description  of  the  plant,  a  robustness  result  is 
derived  for  such  models  in  Section  4.2.3.  A  simple  analytical  example  is 
presented  in  Section  4.3.  Finally  the  chapter  is  concluded  in  Section  4.4. 


4 . 2  Review  of  MAC  Model  for  Robustness  Analysis 

Let  us  recall  that  under  some  simplifying  assumptions,  MAC  can  be 
modelled  as  in  a  classical  control  framework.  The  underlying  assumptions  are: 

(i)  the  actual  plant  h(z)  is  minimum  phase 

(ii)  there  are  no  input  constraints,  i.e.  ft(i)  =  R  for  all  i, 
where  R  is  the  real  line 

(iii)  the  optimization  is  carried  over  one  future  step  ahead  i.e. , 

(T  =  1);  under  this  condition  MAC  is  a  one-step  ahead  pre¬ 
dictive  controller 

The  transfer  functions  under  the  MAC  control  law  for  MIMO  plants  have  been 
developed  in  Equations  (3.7a)  and  (3.7b).  The  corresponding  quantities  for 
SISO  plants  are: 

u(z)  =  _  1  -  a _  (4.1a) 

c(z)  (z-l){i(z)  +  (l-a)h(z) 

y(z)  _  h( z ) ( 1-a) _  (4.1b) 

c(z)  (z-l)fi(z)  +  (l-a)h(z) 

Equations  (4.1a)  and  (4.1b)  imply  that  MAC  under  assumptions  (i)-(iii) 
is  equivalent  to  the  classical  unit  feedback  configuration  of  Figure  3.2 
in  an  input-output  sense.  The  figure  is  again  reproduced  in  the  following 
for  convenience: 
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compensator 


( 


1 


Figure  4.1.  MAC  as  a  Classical  Controller 

This  interpretation  of  MAC  is  the  basis  of  our  analysis  of  MAC  in  the  fra¬ 
mework  of  classical  control. 

4.2.1.  Phase  and  Gain  Margins 


The  block  within  the  dashed  line  can  be  considered  as  a  dynamic 
controller  of  the  classical  type.  The  loop  transfer  function  l(z)  at  point 
1  is 


T _  h(z)( 1-a)  (4.2a) 

Uz;  '  h(z)(z-l) 

and  the  return  difference  function  is 

i  +  ir^'1  -  fc(z>(z-l)  +  h(z)(l-a)  (4.2b) 

+  Uz;  h(z)(z-l) 

Note  that  since  we  are  dealing  with  a  SISO  loop,  the  loop  transfer  function 
at  any  point  of  theloop  is  same.  For  MIMO  loops,  the  loop  transfer  func¬ 
tion  depends  on  the  point  where  the  loop  is  broken  because  of  the  non¬ 
commutativity  of  matrices.  However,  in  this  case  the  error  y(z)  in 
tracking  e(z)  =  c(z)  -  y(z)  is  given  by 

e(z)  =  (1  +  l(z))~l  c(z) 

so  that  the  steady  state  error  due  to  a  step  input  is 

ess(t)  =  lim  (l+l(z))-l  =  (1  +  1(1))  =  0 
z-1 
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whether  the  model  is  exact  or  not.  This  is  a  consequence  of  a  built-in 
integrator  in  the  compensator. 

It  may  be  noted  from  Figure  5  that  at  point  2,  u(z)  =  yr(z)  when 
h(z)  =  ft(z),  where  yr(z)  is  the  reference  signal.  In  this  case  the  input 
u(z)  to  the  actual  plant  is  generated  as  u(z)  =  yr(3)/h(z)  and  therefore 
y(z)  =  h(z)u(z)  =  yr(z).  This  shows  why  perfect  tracking  is  possible 
under  perfect  identification.  We  will,  however,  now  pursue  this  issue 
further. 

It  is  obvious  from  Equations  (4.1)  and  (4.2)  that  the  closed-loop 
system  is  internally  asymptotically  stable  if  the  roots  of  the  rational 
function 

4>cl(z)  =  (z-l)ft(z)  +  (l-a)h(z)  (4.3) 

are  within  the  open  unit  disk  z  <  1,  and  these  roots  are  also  the 
roots  of  the  return  difference  function  1  +  l(z).  We  can  therefore 
find  the  stability  margin  in  terms  of  the  gain  margin  (GM)  and  phase 
margin  (PM)  from  the  Bode  plot  or  Nyquist  plot  of  the  loop  transfer 
function  l(z)  evaluated  on  the  contour  z  =  exp  (juj)  appropriately 
indented  around  the  poles  on  this  contour.  Recall  that  in  continuous¬ 
time,  the  GM  and  PM  are  those  values  of  k  and  <j>  respectively  such  that 
the  perturbed  loop  l(s)  =  kexp( j <{>)1( s )  is  stable,  where  l(s)  is  the 
nominal  loop  and  s  is  the  Laplace  variable.  A  similar  interpretation 
goes  for  the  discrete-time  systems  (Kuo(1980));  but  the  PM,  unless  it 
is  an  integral  value  of  the  sampling  interval,  does  not  have  any  phy¬ 
sical  significance.  Strictly  speaking  the  complex  constant  kexp(j<()) 
in  continuous  time  should  be  replaced  by  kz”n,  n  an  integer,  for 
measuring  GM  or  PM  of  the  discrete-time  system. 

Another  way  to  compare  with  other  continuous-time  domain  design 
techniques  is  that  each  element  of  the  discrete-t ime  loop  should  be 
transformed  into  an  equivalent  continuous-time  element  using  the  bilinear 
transformation,  and  PM  of  the  fictitious  continuous-time  loop  can  be  taken 
as  the  PM  of  the  discrete-t ime  loop.  In  this  paper  the  word  PM  is  used  to 
mean  the  continuous-time  equivalent  phase  margin.  We  can  now  state 
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Theorem  4. 1 


Under  assumptions  (i)-(iii),  MAC  has  GM  =  (0,  2/(l-a)),  equivalent 
PM  -  Cos”1  (l-a)/2,  and  unity  gain  cross-over  frequency  o>0  = 

2sin”1  (l-a)/2. 


Proof 


The  proof  is  trivial  if  we  recall  that  PM  and  GM  are  measure  on  a 
nominal  loop.  Here  we  can  assume  that  the  nominal  plant  h(z)  =  R(z), 
which  implies  h^  =  and  N  =  N  because  both  h(z)  and  R(z)  are  power 
series  in  z“ 1 .  This  nominal  loop  transfer  function  from  (4.2a)  is 
then 


l(z) 


1-a 

z-1 


(4.4) 


i.e.  an  integrator  delayed  by  one-step.  Evaluating  on  z  =  exp(jco),  we 
get 


l(exp(  ju>)) 


1-a 

2 


j 


1-a 

2 


(4.5) 


and  | l(exp( jwO) |  =  1.0  implies  that  unity  gain  cross-over  frequency  at 


ooO  =  2  sin”1 


(4.6) 


The  Nyquist  plot  of  the  discrete-time  loop  in  Equation  (4.5)  is  quite 
simple  and  from  the  plot  it  is  easy  to  see  that  the  system  is  stable  for 
all  gains  in  the  interval  (0,2/ 1-a),  and  a  pure  delay  $  =  90°  -Sin”1 ( 1-a) /2 
will  change  the  number  of  encirclements  by  the  Nyquist  contour,  thus  making 
the  system  unstable. 


To  get  the  equivalent  PM  we  transform  each  element  of  the  loop 
using  the  bilinear  transformation  s  =  (z-l)/(z+l)-1  to  get  the  equiva¬ 
lent  continuous  loop 

From  the  Nyquist  plot  of  l(s)  it  is  obvious  that  GM  (0,2/(l-a))  (same  as 
found  by  analyzing  the  discrete- time  Nyquist  plot)  and  a  PM  =  Cos” 1 ( 1-a) /2 . 
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Theorem  1,  although  very  simple,  reveals  some  intuitively 
appealing  results  about  GM  and  PM  of  MAC.  We  can  make  the  following 
remarks. 

Remarks 

1)  Since  ae[0,l],  the  guaranteed  upward  GM  is  2  and  the  PM  is 
60°. 

2)  We  can  always  trade-off  robustness  against  the  speed  of 
response.  As  response  speed  is  increased  by  decreasing  a,  BW 
(jJ0  -  2sin“l  (L-a)/2  increases  (which  makes  sense)  with  a  con¬ 
sequent  reduction  of  robustness  in  terms  of  GM  and  PM. 

3)  We  get  this  remarkable  PM  even  though  MAC  is  an  output- 
feedback  controller  possibly  because  the  plant  is  inverted 
causally  through  the  use  of  an  optimization  algorithm  in  the 
sense  that  at  each  time  the  algorithm  provides  the  controller 
with  the  entire  future  input  sequence.  For  the  same  reason, 
the  discrete-time  loop  has  a  one  pole  roll-off  for  all  fre¬ 
quencies  -  which  is  rather  unusual. 

4)  Theorem  I  ensures  that  the  controller  can  stabilize  the  loop 
for  all  the  plants  {hj_}  belonging  to  the  set 

thilhi  =  k^i>  i=l,...,N,  ke(0 ,2/ ( 1-a) ) } . 

4.2.2.  Plant  Robustness  Analysis  for  Generalized  Perturbations 

The  nominal  model  R(z)  is  usually  different  from  the  actual  plant 
h(z)  for  various  reasons.  Sometimes  ft(z)  is  deliberately  made  simple 
to  facilitate  the  control  computation  by  retaining  the  modes  in  the 
active  frequency  range.  On  many  occasion  it  is  difficult  to  model 
high  frequency  modes,  and  these  are  simply  neglected.  Due  to  ageing, 
etc. ,  the  modes  of  the  actual  plant  drifts  slowly  thus  introducing 
low-frequency  error.  Thus  the  modeling  error  e(z)  has  in  almost  every 
case,  a  dynamic  structure;  and  the  information  about  e(z)  must  be 
incorporated  in  designing  a  nominal  loop.  As  a  minimum  amount  of 
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information  e(z)  is  expressed  as  an  upperbound  on  |e(exp(jw)|;  and  the 
purpose  of  robustness  analysis  is  to  find  a  requirement  on  the  nominal 
loop  interms  of  this  upperbound  so  that  the  closed  loop  performance 
and  stability  is  maintained  in  the  face  of  modeling  uncertainty. 

Usually  the  admissible  uncertainties  are  expressed  in  two  ways: 
additively  or  multiplicat ively .  If  we  take  fr(z)  as  the  nominal  plant, 
then  in  an  additively  uncertain  model,  we  express  the  actual  plant 
h(z)  as 

h(z)  =  R(z)  +  Aha( z )  (4,8) 

and  in  a  multiplicatively  uncertain  model,  the  actual  plant  h(z)  is 

h(z)  =  R(z)(l  +  Aftm(z))  (4.9a) 

or 

h(z)  =  h(z)  Ahm(z)  (4.9b) 

For  single-loop  systems  the  order  of  multiplication  in  (4.9)  is  not  rele¬ 
vant,  but  for  MIMO  cases  the  order  is  important  because  of  the  non¬ 
commutativity  of  matrices  where  input  channel  (left)  uncertainty  and 
output-channel  (right)  uncertainty  must  be  distinguished.  Both  of  the 
multiplicative  forms  in  (4.9)  are  often  used  in  analysis,  but  in  this  paper 
we  shall  be  using  (4.9b).  Note  that  at  nominal  values  of  the  plant,  Aha(z) 
=  Ahm  (z)  =  0  and  Ahm  (z)  =  1.  Also  note  that  the  classical  GM  and  PM 
ensures  the  stability  of  a  perturbed  plant  of  the  form  (4.9b).  If  the  GM 
is  k,  then  Ahm(z)=k,  and  if  the  PM=n  (in  the  sense  of  discrete-data 
system),  Ahm( z )=z_n.  These  are  undoubtedly  a  limited  class  of  allowable 
perturbations  and  we  must  consider  other  possible  error-structures  in 
designing  the  nominal  loop.  The  framework  of  (4.8)  and  (4.9)  is  more 
general  in  the  sense  that  it  can  handle  a  constant,  nonconstant  and  even 
dynamic  model  mismatch  (say  for  example  unmodelled  poles,  etc.).  Let  us 
rewrite  ft(z)  and  h(z)  as 

N 

R(z)  =  L  friz-1  =  z~^fip(z)  (4.10a) 

i=  1 

N 

where  Rp(z)  =  ^  R^z^-i  =  a  polynomial  in  z, 

i=l 
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and 


h(z)  =  z  N  hp( z) , 

N 

h  (z)  =  l  hizN'i  (4.10b) 

F  i=l 

then  by  straightforward  manipulation,  the  closed  loop  characteristics 
polynomial  is 

^cl,p  (z)  =  zN(z-l)hp(z)  +  z^  ( l-a)hp(z)  (4.11) 

with  p  denoting  that  we  are  considering  the  polynomial  part  only. 

For  closed-loop  stability,  4>cl,p(z)  must  have  all  the  roots  strictly 
inside  the  unit  disk  |z|=l.  For  perfect  identification  N=N , 
hp(z)=hp(z),  and  <J>C^ y  p( z )  =zN(  z-a) hp( z ) .  Of  course  the  zeros  of  Rp(z) 
will  be  cancelled  eventually  leaving  the  only  closed  loop  pole 
at  z =ot.  However  N,  the  order  of  the  true  plant,  is  usually  unknown,  and 
therefore  in  real-world  situations  (4.11)  can  not  be  evaluated.  The  actual 
plant  h(z)  must  be  considered  as  a  perturbation  of  the  nominal  plant  Ft(z), 
and  the  stability  conditions  must  be  derived  in  terms  of  the  nominal 
sequence  {h*}  and  the  perturbation  Aha(z)  or  Ahm(z).  Let  us  assume  that 
Aha(z)  and  Ahm(z)  can  be  expressed  as  in  (4.10),  i.e., 

Na 

Aha(z)  =  \  haiz  ^ 

i=l 

-  z~Na^hap(z)>  Ahap(z)  =  a  polynomial  in  z 

Nm 

A hm ( z )  —  ^  ^^miz  ^  =  z  ^mhmp( z )  • 
i=l 


(4 . 12a) 

(4.12b) 


although  the  following  theorem  can  be  developed  without  such  an  expli¬ 
cit  form.  Note  that  the  index  in  (4.12b)  must  start  from  0  to  accomo¬ 
date  constant  multiplicative  perturbation.  We  have  the  following 
theorem  on  robustness: 
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Theorem  4.2 


(i)  The  system  is  closed-loop  stable  for  all  additive  pertur¬ 
bations  Aha(z)  satisfying 


|Ahap(z) |  < 


/  1  -  2a  Cosoo  +  a^ 


|Kp<z) 


1-a  1 “Pv  1  (4 . 13a) 

and  z-  exp(jo)) 

(ii)  The  system  is  closed-loop  stable  for  all  multiplicative 
perturbations  Ah^Cz)  satisfying 


|Ahmp  (z)  -  z  m  |  < 


z 

-  a 

1 

-  a 

(4 . 13b) 


on  the  unit  circle  where  Ahap(z)  and  Ahj^^z)  are  given  by  (4.8). 

Proof:  The  proof  is  straightforward  if  we  express  h(z)  using  the 

form  (4- 10)— (4* 11) ,  find  the  corresponding  closed-loop  characteristic  polyno¬ 
mial,  and  finally  use  Rouch's  theorem  to  prove  (4.13)  on  the  assumption  that 
the  nominal  loop  is  internally  stable  and  hence  (z-a)hp(z)  has  all  the 
roots  strictly  inside  the  unit  disk  |z|-l. 

The  tests  of  the  type  given  in  (4.13)  are  sufficient  conditions 
and  generally  tend  to  be  conservative.  Nevertheless  we  can  make  the 
following  remarks : 

(i)  Both  tests  (4.13a)  and  (4.13b)  are  useful.  For  example  when 
an  actual  known  model  {h-^,  i=l,...,N|  is  truncated  to 
obtain  {"h±y  i=l,...,N,  N  <  n}  ,  so  that  {Ahaj_  =  h^,  i=N, 

N+1,...N  and  Ahaj_  =  0,  i  <  n}  ,  stability  around  (hjj  can  be 
obtained  from  (4.13) 

(ii)  For  constant  multiplicative  gain  mismatch,  i.e.  h^  =  kR-^ 

for  all  i,  (Ahm^  =  k  when  i=0  and  Ahmj_  =  0  when  i  >  0 }  ,  so 
that  Ahmp(z)  =  kz  m  and  test  (4.13b)  yields  that  the  system 
is  stable  for  all  k  such  that 


k  -  1  < 


z  -  a 


,  z  =  exp(ja)) 


(4.14) 
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But  it  is  easy  to  see  that  min  |exp(jo))-a|  -  1-a  so  that  (4.10) 
becomes  | k— 1 | < 1  which  implies  ke(0,2).  This  clearly  shows  that  these 
tests  are  conservative.  (See  remark  (4)  of  the  previous  section). 

It  can  be  shown  trivially  that  near  u)=0 ,  the  bound  on  the  RHS  of 
(4.13a)  is  meaningless;  almost  any  reasonable  perturbation  will  satisfy 
this  sufficiency  condition  at  low  frequencies,  but  the  above  inequality 
must  be  obeyed  for  each  we[0,H]  particularly  at  high  frequencies. 

We  note  further  that  given  any  perturbation  Ahp(jaj),  it  is  extremely 
difficult  to  come  up  with  a  stable  design  to  accomodate  it.  On  the  other 
hand,  given  any  stable  design  we  can  only  make  statements  about  the  size  of 
a  perturbation  the  design  can  tolerate,  and  perhaps  from  our  previous 
experience  we  can  change  the  nominal  design  iteratively  to  accomodate  the 
given  perturbation. 

4.2.3.  Robustness  Analysis  When  the  Plant  Model  is  Described  by  a  Rational 
Transfer  Function 

In  the  previous  section  we  analyzed  the  robustness  of  the  MAC 
control  law  for  systems  represented  by  an  impulse  response  sequence. 

In  this  section  the  analysis  will  be  carried  out  for  plants  described 
by  Difference  Equations  (DE)  -  this  will  yield  more  insight  into  the  rela¬ 
tion  between  the  robustness  of  MAC  and  the  design  parameters  embedded 
in  it. 

We  analyze  again  under  the  usual  assumptions,  viz, 

(i)  the  system  is  minimum  phase 

(ii)  the  optimizing  horizon  is  one-step  in  the  future 

(iii)  there  are  no  constraints  either  on  the  input  or  any  other 
loop  variables 

Under  these  assumptions,  the  MAC  control  law  is  given  by  equations 
( 4 . 1 a)-( 4 . 1 b)  and  the  equivalent  classical  network  is  given  in  Figure  4.1. 
Obviously  then  the  loop  transfer  function  is 

W  \  _  (l-ct)h(z)  (4.15a) 

1(z)  (z-l)h(z) 
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so  that  the  return  difference  function  is 

l  +  -i  _  ,  .  ( l-a)h(z)  =  ( z-l)h(z)  +  (l-ct)h(z) 

K  ’  (z-l)R(z)  (z-l)R(z)  (4.15b) 

Clearly  the  closed  loop  poles  are  given  by  the  zeros  of  the  numerator 
(4.15b).  We  have  shown  in  the  previous  section  that  the  MAC  control 
law  is  nominally  closed  loop  stable  for  all  values  of  a,  o<a<l. 

A  typical  Nyquist  plot  is  shown  in  Figure  4.2.  It  is  obvious  from 
the  figure  that  at  any  frequency  u)Q,  the  loop  transfer  function 
l(eja}°)  can  tolerate  a  maximum  perturbation  of  |l+l(edwo)|  and  yet  the 
Nyquist  plot  will  not  change  the  number  of  encirclements  of  the  -1+jO 
point.  This  observation  leads  to  the  following  theorem  on  additive 
perturbations. 


axis 


Theorem  4.3 

Suppose  the  loop  is  nominally  stable.  The  the  perturbed  loop  is  stable  for 
all  additive  perturbations  Al(z)  satisfying 

| Al(eJ w)  |  <  |l+l(ejw) |  (4.16) 

where  0)  varies  over  the  unit  circle  if  l(z)  is  analytic  on  the  contour 
\z\  =  1  or  over  any  suitable  indented  contour  on  the  unit  circle  to 
bypass  any  singularity  of  l(z)  on  the  unit  circle. 
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Proof :  A  heuristic  proof  should  be  obvious  from  Figure  4.2.  A 

rigorous  proof  follows  from  a  straightforward  application  of  Rouches1 
Theorem  as  in  the  previous  section. 

A  similar  theorem  can  be  developed  for  multiplicative  pertur¬ 
bations.  Theorem  4.3  gives  the  sufficiency  condition  for  stability. 

Its  usefulness  lies  in  the  fact  that  given  an  apriori  knowledge  of  a 
perturbation  that  satisfies  the  inequality  (4.16),  the  Theorem  guarantees 
the  stability  of  the  closed  loop  system  for  such  perturbation.  For 
example  if  a  high  frequency  mode  is  neglected  or  if  the  modes  are  not 
correctly  modeled,  the  discrepancy  is  expressed  in  an  additive  form  and  a 
test  of  the  type  (4.16)  must  be  carried  out  after  a  nominal  control  law  has 
been  found. 

We  can  find  a  more  specific  form  of  Equation  (4.16)  as  follows. 

Suppose  the  nomimal  (or  identified)  plant  is  R(z).  The  true  plant 
h(z)  is  assumed  to  lie  in  a  neighborhood  of  H(z),  and  suppose  h(z)  is 
an  additive  perturbation  of  fi(z).  In  this  case 


h(z)  =  Fl(z)  +  Aha(z)  . 


(4.17) 


The  designer  usually  has  a  knowledge  of  an  upperbound  on  | Aha( e J w)  | . 
The  nominal  loop  transfer  function  l(z)  and  the  nomimal  return  dif¬ 
ference  function  1  +  h(z)  can  be  found  from  Equation  (4.2).  These  are 


Kz)  = 


,  1  +  l(z)  =  £1^ 


z-1 


z-1 


(4.18) 


Let  Al(z)  be  an  additive  perturbation  of  the  nominal  loop  l(z)  when 
the  nominal  plant  fi(z)  is  perturbed  to  h(z)  as  in  Equation  (4.17). 
Then  the  perturbed  loop  transfer  function  l(z)  +  Al(z)  can  also  be 
evaluated  using  Equation  (4.2)  and  we  get 


1 ( z )  +  Al( z )  =  h(z)(l-a) 
h( z ) ( z- 1 ) 
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from  which  we  find 


Al(z)  =  Aha(z)  i-a  .  (4.19) 

h( z )  Z~1 

Therefore  using  Theorem  4.3,  we  conclude  that  the  closed-loop  is 
stable  for  all  perturbations  Aha(z)  which  satisfy 

|Aha(z)|  <  |h  (z)  |  (4.20a 

1  -  a 

on  the  unit  circle  z=exp(jw).  This  inequality  can  further  be 
simplified  to 

|Aha(ej“)|  <  £j~ +.  ^  ~  2.a  Cos  a)  |h(ja))|  ,  (4.20b) 

1-a 

which  can  be  verified  easily  by  plotting  these  functions. 

It  is  very  important  to  note  that  the  conditions  developed  in 
Theorems  4.1,  4.2,  and  4.3  are  all  sufficiency  conditions  and  not 
necessary  ones.  If  any  perturbation  Aha(z)  or  Al(z)  violates  these 
conditions,  the  closed  loop  is  not  necessarily  unstable;  on  the  other 
hand,  satisfaction  of  these  conditions  necessarily  guarantees  asymp¬ 
totic  stability  of  the  perturbed  closed-loop  provided  that  the  nominal 
closed-loop  is  stable. 


4 . 3  Examples 


In  this  section,  the  main  features  of  the  analyiss  of  the  last 
section  are  demonstrated  through  a  simple  example.  Since  the  IR 
description  contains  many  more  parameters  than  in  the  DE  description, 
we  use  a  rational  transfer  function  model  of  the  plant.  The  Theorem 
4.3  will  be  used  to  evaluate  the  robustness  against  modelling  mismatch 
of  the  true  plant. 

Example  4.1 


Consider  again  the  example  of  a  scalar  dynamic  system  of  the  last 
chapter.  Suppose  it  has  been  modelled  as 
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x(k+l)  =  £  x(k)  +  gu(k) 


(4.21a) 


y(k)  =  cx(k).  (4.21b) 

Assume  for  simplicity  that  cg=l  so  that  the  rational  transfer  func 
tion  of  the  model  is 

h(z)  =  1— ~  . 

z  -  f 

Then  if  the  true  plant  h(z)  =  £t(z)  +  Aha(z),  according  to  Equation 
(4.20b)  the  closed  loop  is  stable  for  all  Aha(z)  satisfying 

- 2 - - - - - 

|Aha(jo>)|  <  ~  Cos  a) _ 

(l-a)/l  +  ?  -  2f  Cos  a) 

Now  suppose  that  the  actual  plant  is  of  the  form 
x(k+l)  =  f  x(k)  +  gu(k) 
y(k)  =  cx(k) 

which  is  the  same  as  the  nominal  model  in  (4.21)  except  that  the  true 
mode  f  is  different  from  the  nomimal  mode  £.  Therefore 

h(z)  =  _ I _  ,  (4.23c) 

Z  “  f 

and  Aha(eJC0)  is  of  the  form 

Aha(  jto)  =  _ f  ~  1 _  .  (4.24) 

ej2u  _  ej(1)(f+f)  +  ff 

Given  a  nominal  MAC  loop  for  a  specified  £,  the  loop  is  stable  for  all 
f's  if  |Aha(jw) |  evaluated  from  (4.24)  satisfies  the  inequality 
(4.22).  We  selected  ?=  0.3  and  tested  inequality  (4.22)  for  three 
different  f!s:  f  =  0.8,  f  =  -0.3,  f  =  -0.8.  In  all  cases,  the  set 
point  =  13.0,  and  a=0.1  are  selected.  Since  £  is  the  same  for  all 
three  runs,  the  right-hand  side  of  (4.22)  is  also  the  same  and  is 
shown  as  a  thick  line  in  all  the  plots.  The  left-hand  side  of  (4.22) 
is  shown  in  dotted  lines. 


(4.23a) 

(4.23b) 


(4.21c) 


(4.22) 
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Case  1:  f  =  0.3,  f  =  0.8 


Here  the  true  plant  has  the  mode  at  0.8.  For  perfect  iden¬ 
tification  the  MAC  response  is  shown  in  Figure  4.3a.  For  an  iden¬ 
tified  plant  mode  at  ?=  0.3,  the  sufficiency  conditions  are  displayed 
in  Figure  4.36  -  which  shows  that  this  perturbation  satisfies  the  ine¬ 
quality  constraints  in  (4.22).  The  closed  loop,  therefore,  is 
guaranteed  to  be  stable  as  shown  by  the  MAC  performance  for  the  per¬ 
turbed  loop  in  Figure  4.3c. 

Case  2:  ?  =  0.3,  f  =  -0.3 

MAC  performance  for  the  true  plant  f=-0.3  is  shown  in  Figure  4.4a. 
The  sufficiency  conditions  are  displayed  in  Figure  4.4b  which  shows 
that  the  inequality  has  been  violated.  But  because  these  conditions 
are  only  sufficient,  we  cannot  say  anything  of  the  stability  of  the 
loop.  In  this  particular  situation,  the  perturbed  closed  loop  has 
turned  out  to  be  stable  as  shown  in  Figure  4.4c. 

Case  3:  ?  =  0.3,  f  =  -0.8 

MAC  performance  for  the  true  plant  at  f  =  -0.8  is  shown  in  Figure 
4.5a.  The  two  sides  of  inequality  (4.22)  are  drawn  in  Figure  4.5b, 
which  shows  that,  as  in  Case  2,  the  sufficiency  condition  has  been 
violated.  But  this  time,  the  closed  loop  is  unstable,  as  shown  in 
Figure  4.5c. 


4.4  Conclusion 


An  analytical  model  of  MAC  was  developed  in  Chapter  3;  we  have 
used  that  model  in  this  chapter  to  analyze  the  robustness  of  MAC.  The 
robustness  as  been  assessed  ina  classical  control  framework.  The 
classical  GM  and  PM  of  MAC  are  given  in  Theorem  4.1.  The  upward  GM 
can  be  increased  arbitrarily  by  slowing  down  reference  the  trajectory, 
PM  can  go  up  to  90°.  GM  and  PM  can  guarantee  stability  against  a 
limited  class  of  plant  perturbations,  therefore  a  new  framework  for 
analyzing  generalized  perturbations  has  been  developed  in  Section 
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4.2.2  and  the  main  result  in  this  direction  is  presented  in  Theorem 
4.2.  The  corresponding  analysis  for  models  described  by  rational 
transfer  functions  is  given  in  Section  4.2.3  and  the  main  result  is 
presented  in  Theorem  4.3.  Theorem  4.1  and  4.2  can  be  readily  verified 
by  plotting  transfer  functions.  These  Theorems  give  the  sets  of 
plants  in  the  neighborhood  of  the  identified  models  which  are 
guaranteed  to  be  closed-loop  stable  whenever  the  nominal  loop  is  stable. 
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Figure  4.3b  Plant  Perturbations  Satisfy  Sufficiency  Condition. 
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CHAPTER  5 


SAMPLING  INTERVAL  &  CONTROLLABILITY 


5.1  Introduction 


Sarapled-data  (SD)  systems  are  becoming  increasingly  important 
with  the  advent  of  cheap  computing  power  of  microprocessors.  Although 
these  systems  have  been  studied  for  a  long  time,  very  few  researchers 
have  explicitly  dealt  with  the  design  of  a  suitable  "sampling  time 
interval  'T'.  Almost  all  literature  dictates  a  sampling  rate 
satisfying  the  Nyquist  rate — although  the  latter  is  applicable  only 
for  band-limited  systems.  For  the  systems  with  undamped  modes,  only 
certain  discrete  values  of  T  are  excluded  (Chen,  1970)  to  guarantee 
the  required  rank  of  the  "Cont rollabiity  Matrix"  of  the  SD  systems. 
Nothing  further  is  said  as  to  what  values  of  T  should  be  chosen  once 
the  rank  condition  of  this  matrix  is  satisfied. 

A  recent  study  in  this  direction  is  by  Reid  et  al.(1979),  where  T 
is  uniquely  chosen  to  maximize  the  robustness  of  a  dead-beat  control 
law.  Although  this  is  a  significant  step  towards  the  characterization 
of  a  unique  T,  the  procedure  has  limited  application  because  not  all 
of  the  SD  systems  will  be  used  for  the  purpose  of  dead-beat  control. 
Maximizing  the  determinant  of  the  product  of  the  controllability 
matrix  and  its  transpose  are  much  discussed  in  the  literature,  but 
without  any  rational  justification. 

In  this  study  we  have  provided  a  logical  and  intuitively 
appealing  framework  for  choosing  an  optimal,  unique  T.  Our  analysis 
is  based  on  two  intuitive  ideas: 

(1)  that  the  amount  of  energy  needed  to  drive  a  discrete  system 
from  an  arbitrary  initial  state  to  the  origin  is  a  measure 
of  the  controllability  of  the  system, 

(2)  that  the  amount  of  energy  is  also  a  measure  of  the  degree  of 
effectiveness  of  various  control  components. 
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A  minimum  energy  terminal  control  problem  is  formulated  which 
explains  controllability  in  a  quantitative  framework  and  its  relation 
with  sampling  time  T.  The  solution  is  given  in  terms  of  the 
"Controllability  Grammian , "  and  a  natural  choice  of  T  is  made  by 
maximizing  the  minimum  singular  value  of  the  Grammian  matrix  over  a 
compact  interval  of  T.  The  excitation  ability  of  various  control 
components  (or  equivalently  how  effectively  each  control  component 
influences  the  dynamics  of  the  system)  depends  upon  the  relative 
orientation  between  the  space  spanned  by  left  eigenvectors  of  the 
system  matrix  and  the  range  space  of  the  input  distribution  matrix. 

It  is  extremely  difficult  to  visualize  the  interplay  between  a 
changing  T  and  the  relative  orientation  of  the  spaces.  This  has  led 
us  to  solve  the  problem  implicitly  as  a  minimum  energy  problem  where 
the  relative  orientation  changes  automatically  as  T  varies  to  provide 
optimal  effectiveness  of  the  control  components. 

Sometimes  control  components  may  have  different  costs.  We  would 
prefer,  then,  that  the  two  spaces  adjust  to  reflect  the  relative  costs 
so  that  the  system  uses  more  of  the  cheaper  controls  than  others.  We 
have  implemented  this  idea  by  introducing  an  "input-weighted 
controllability  Gramian"  matrix. 

The  above  ideas  can  be  dualized  to  find  an  optimal  T  from  the 
viewpoint  of  observability.  Here  T  is  chosen  to  minimize  the  maximum 
possible  energy  in  the  outputs  for  arbitrary  initial  states.  Since 
the  Hankel  matrix  is  the  product  of  the  controllability  and 
observability  matrix,  the  corresponding  values  of  T  can  be  deduced 
from  the  singular  values  of  the  Hankel  matrix,  too. 

In  sections  5.2  and  5.3  we  briefly  discuss  the  relation  between 
SD  systems  and  the  original  continuous  time  systems  and  the  previous 
results  on  the  controllability  of  the  SD  systems.  Section  5.4  also 
contains  a  brief  discussion  on  modal  controllability  and 
observability.  In  section  5.5  we  have  formulated  the  minimum  energy 
problem  in  the  new  perspective  for  finding  an  optimal  T.  Section  5.6 
deals  with  the  observability  issues.  Conclusions  are  given  in  section 
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5.7.  The  analysis  has  been  kept  limited  to  LTI  systems  for  the  sake  of 
clarity  although  generalization  to  time  varying  systems  are  concep¬ 
tually  straightforward. 

5.2  Problem  Definition 

Consider  a  linear  time-invariant  continuous-time  system 


x(t)  =  Ax(t)  +  Bu(t) 

(5.1a) 

y(t)  =  Cx(t) 

(5.1b) 

where  x(t)  e  Rn,  u(t)  e  Rm,  y(t)  e  RP  and  A,  B,  C  are  matrices 
of  compatible  dimensions. 

There  are  many  sampling  schemes  to  discretize  the  system  (5.1). 
We  shall  be  using  here  the  “sample  and  zero-order  hold”  sampling 
mechanism,  because  it  is  easier  to  implement  and  probably  the  scheme 
most  widely  used  in  industries.  Under  this  scheme  the  input  is  held 
constant  during  the  sampling  interval  and  the  corresponding  discrete 
system  is  given  by 


x(k+l)  =  Fx(k)  +  Gu(  k) 

(5.2a) 

y(k)  =  Hx(k) 

(5.2b) 

F  =  exp(AT) 

(5.3a) 

T 

G  =  [  /  exp(Av)dv]B 

0 

(5.3b) 

=  (F-I)  A  1b,  when  A  is  nonsingular 


H  -  C 

(5.3c) 

and  exp(AT)  is  the  transition  matrix  associated  with  (5.1).  The 
solution  of  equation  (5.2)  is  given  by 


k-1 

x(k)  =  Fkx(0)  +  Z  Fk-1-iGu(l) 

i=0 

y(k)  =  Cx(k) 

(5.4a) 

(5.4b) 

where  Fk  is  the  state-transition  matrix  of  (5.2). 
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Roughly  speaking  the  controllability  of  a  system  refers  to  its 
ability  to  steer  any  initial  condition  x(0)  at  k=0  to  the  origin  at 
k  >0  for  finite  k^  whereas  the  reachability  refers  to  its  ability  to 
steer  the  system  from  the  origin  to  any  given  state  in  finite  time* 
Because  of  the  non-singularity  nature  of  exp(AT) ,  the  notions  of 
controllability  and  reachability  in  continuous-time  systems  coincide. 
For  discrete  time  systems,  obviously  a  sufficient  condition  for  the 
system  to  be  controllable  is  that  be  non-singular  for  each  k,  i.e., 
the  system  has  the  ability  of  backward  transition  whereas  the 
reachability  is  the  property  of  the  range  space  {F^G},  k  =  0,1,.-.- 
The  controllability  can  be  checked  through  the  controllability 
Gramarian  formed  over  a  finite  horizon;  and  for  time-invariant 
systems,  a  horizon  of  n-steps  is  necessary  and  sufficient.  The  pair 
{F,G}  is  controllable  if  the  Controllability  Grammian 

k-1 

W(0 ,  k)  -  Z  F^G  GT(  F*"1)  (5.5) 

i-0 

is  non-singular  for  any  k>n.  Equation  (5.5)  also  shows  why  the 
non-singularity  of  F  is  necessary. 

In  a  sample  and  zero-order  hold  mechanism,  F  is  given  by  (5.3a) 
which  means  F  is  necessarily  non-singular  for  any  A.  Thus  the  notion 
of  controllability  and  reachability  are  the  same,  and  we  shall  be 
using  the  word  controllability  hereafter  to  denote  both  concepts. 

Sometimes  the  discretization  mechanism  (5.3a)  goes  by  the  name  of 
"exponential  transform."  It  is  obvious  from  (5.3a)  that  under  this 
mapping,  both  the  continuous-time  and  the  sampled-data  system  share 
the  same  eigenvectors  and  their  poles  are  related  through 

z±  =  exp( s^T) 

where  s^  and  z^  are  respectively  the  ith  eigenvalue  of  F  and  A.  Also 
note  that 
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/  exp(Av)dv  is  always  non- singular  even  if  A  is  singular, 

0 

because 


det [  /  exp  (Av)dv] 

0 


n 

It  Pi  *  0.0 
i“l 


where 


Pi  = 


1  -  exp(sjT) 

Si 


when  s^  t  0 


(5.6) 


T  when  Si  =  0 

Equality  (5.6)  is  obvious  from  the  Jordan  form  of  A. 


5.3  Controllability  and  Observability  of  SD  system: 

The  controllability  and  observability  of  the  time-invariant  (TI) 
sampled-data  system  is  a  well-studied  topic.  Probably  the  mostly  used 
criteria  is  the  rank  condition  of  the  controllability  matrix  C  and  the 
observability  matrix  0,  where 


C  =  [G  :  FG  :  Fn_1G] 


(5.7a) 


and 


H 

HF 


+ 

HFn_1- 


(5.7b) 


The  system  is  controllable  if  p(<g?)=n  and  the  system  is  observable 
if  p(0)=n,  where  p(A)  denotes  the  rank  of  A.  The  matrices  F  and  G 
depend  upon  T  whereas  H  does  not.  One  way  to  determine  the  role  of 
the  sampling  time  interval  T  on  the  controllability  and  observability 
of  the  system  is  to  check  the  rank  condition  of  the  matrices  in  (5.7) 
as  T  is  continuously  increased.  The  most  significant  results 
available  in  this  direction  are  contained  in  the  following  theorem 
extracted  from  [2]. 
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THEOREM  2.  L 


Assume  that  the  continuous  time  system  (5.1a)  is  controllable.  A 
sufficient  condition  for  the  discrete  time  system  (5.2a)  with 
coefficients  in  (5.3)  to  be  controllable  is  that  Im[ X^( A)-Xj ( A) ] * 
2irk/T,  k=±l,±2,...  whenever  Re[  A  j_(  A)-Xj  ( A)  ]  =0.  For  the  single-input 
case,  the  condition  is  necessary  as  well. 

We  can  make  the  following  remarks  as  a  corollary  of  Theorem  2.1: 

1.  The  conditions  are  also  sufficient  for  maintaining  the 
observability  of  the  SD  system,  because  the  pair  F,  H  is 
observable  if  and  only  if  the  pair  {F*  >  H*  }  is  controllable; 
and  the  Theorem  gives  the  condition  in  terms  of  the 
eigenvalues  of  A,  not  in  terms  of  H  or  G. 

2.  If  \j i=o i  ±  ja)j[  is  any  complex  pole  pair  of  A,  T  should  not 
be  chosen  such  that  T=kTr/o)i,  k=±l,±2....  Therefore  for  SISO 
systems,  as  T-[  is  increased,  SD  system  (5.2a)  loses 
controllability  for  as  many  values  of  T  and  their  integral 
multiples  as  there  are  complex  pole  pairs.  Obviously  by  a 
continuity  argument  we  can  say  that  the  controllability 
matrix  will  be  ill-conditioned  for  T  in  the  neighborhood  of 
these  T-j/s. 

3.  Although  not  related  to  this  theorem,  another  requirement  on 
T  to  avoid  aliasing  effects  is  that  we  must  sample  the 
system  at  a  Nyquist  rate  at  the  least.  If  {%ax=m^x  u)j_, 
then  T  should  be  selected  such  that 


T  < 


7T 

%ax 


(5.8) 


Note  that  if  we  choose  T=Tr/o)max  exactly  satisfying  the  Nyquist 
rate,  we  lose  controllability  for  SISO  systems. 
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5.4  ModaL  Controllability  and  Observability 


Theorem  (5.1)  does  not  provide  any  "quantat ive"  information  on 
the  "degree"  of  controllability  which  is  best  explained  by  modal 
controllability.  The  concepts  of  modal  controllability  and 
observability  are  old  and  can  be  found  in  any  standard  text  on  control 
theory.  In  this  subsection  we  discuss  briefly  how  the  sampling  time  T 
is  related  to  these  ideas.  Assume  for  simplicity  that  A  is 
diagonalizable.  The  modal  decomposition  of  A  is 

A  =  WAV*  (5.9) 

where  A  is  the  diagonal  matrix  containing  the  eigenvalues  A^  of  A,  W 
and  VT  are  respectively  the  matrices  containing  right-  and 
left-eigenvectors  of  A. 

If  w-£  and  v-^  are  right-  and  left-eigenvectors  respectively 
associated  with  ith  eigenvalue  A^,  then 

W  =  col  (w^ ,  w2,...wn)  (5.10a) 

T 

V1 

t  I  1 

V'  =  row  (v^  v2,  ...  vn)  (5.10b) 

and 

WV'  =  V'W  =  In 


Then  the  modal  decomposition  of  F  is 
F  =  W  exp(AT)V'  =  W  AF  V’ 


where  {Ap}  =exp  X^T  =  zF,  the  i-th  mode  of  the  SD  system  (5.2).  By 
straightforward  calculation,  (5.4)  simplifies  to 


x(k) 


n  ,  k- 1  n  , 

E  (z±)k  (vix(0))  w^  +  E  E  (zi)k_1  1  (vjG)u( i)wj 
i=l  i=0  j=l 

(5.11a) 
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yOO  -  Hx(  k) 


X  '  1  * 

E  (z^*  (vix(0))Hwi  +  E  E  (zi)k-1_i(viG)u(i)Hw(j ) 

i=l  i=0  j=l 

(5.11b) 


It  should  be  apparent  from  (5.11)  that  it  is  the  row  vector  (vjG) 
that  determines  whether  the  control  at  the  i-th  instant  u(i)  will  have 
any  influence  on  the  j-th  mode  of  the  system.  If  this  row  vector  is 
identically  zero  for  any  j,  i.e.,  if  v j e  left  ker  (G)  then  the  j-th 
mode  is  uncontrollable  and  the  component  of  the  state  in  the  subspace 
spanned  by  j-th  eigenvector  cannot  be  controlled.  Similarly  if  gk  is 
the  k-th  column  of  G,  then  mj^  =  <vj ,gk>  determines  the  sensitivity  of 
the  k-th  component  of  the  control  u^  on  the  j-th  mode.  In  particular, 
if  we  form  the  n  x  m  matrix  M  =  {raj^} ,  j=l,...n,  k=l,...m,  where 

M  =  V'G  (5.12) 


we  can  deduce  the  controllability  as  well  as  the  "degree  of 
controllability"  of  various  imput-coraponents  from  the  entries  of  M.  M 
is  called  the  modal  controllability  matrix.  To  increase  the 
sensitivity  of  the  k-th  control  on  the  j-th  mode,  we  should  design  g^ 
as  much  collinear  with  vj  as  possible.  It  is  easy  to  show  that  M  is 
related  with  C  in  (5.7a)  as 


C  =  W[M  :  AfM  : 


n-1 

Af  M] 


(5.13) 


where  W  is  the  matrix  of  right  eigenvectors  as  defined  in  (10),  and  if 
any  row  of  M  is  identically  zero  then  the  controllability  matrix  C 
becomes  rank  deficient. 

For  the  zeroeth  order  sample  and  hold  (S&H)  mechanism  under 
consideration 

T  T 

M  =  V'  /  exp  (Ax)dxB  =  (  /  exp( Ax)dx)v' B  (5.14) 

0  0 

Now  V'B  and  A  are  predetermined  by  the  continuous-time  system  (5,1a). 
The  only  variable  here  is  T  which  can  be  adjusted  to  regulate  the 
elements  of  M. 
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Following  the  same  argument  as  above  we  can  deduce  from  equations 
(5.11)  that  the  degree  of  the  modal  observability  is  given  by  the 
modal  observability  matrix  N  where 


N  =  HW 


(5.15) 


and  the  observability  matrix  in  terms  of  N  is 


0 


N 

NAp 


Vf 


*n-l 

_  Np  _ 


(5.16) 


Since  H=C  and  W  is  predetermined  by  the  continuous-system,  N  is 
not  affected  by  T  ,  i.e.,  the  modal  observability  matrix  of  a 
discretized  system  is  the  same  as  in  the  continuous-time  system 
although  the  observability  matrix  0  in  (16)  is  dependent  on  T.  This 
shows  that  the  sampling  time  T  will  have  more  impact  on  the  "degree  of 
controllability"  than  on  the  "degree  of  observability"  because  T 
influences  both  the  system  matrix  F  and  input  matrix  G  forming  the 
controllability  matrix. 

5 . 5  Sampling  Time  to  Maximize  the  Degree  of  Controllability 

In  this  section  we  formulate  a  minimum  energy  terminal  control 
problem  for  the  discretized  system  and  explain  why  this  scalar  measure 
can  be  naturally  taken  as  a  "degree  of  controllability."  Finally  we 
choose  T  to  optimize  this  scalar  measure.  Recall  that  a  controllable 
discrete  system  can  be  driven  to  zero-state  from  any  initial  state  in 
n-steps  which  motivates  an  optimization  horizon  of  n-steps.  Consider 
then  the  minimization  of  the  cost  functional 

i  n_1 

min  j(x(0))=  ~  E  u' ( i)R(i)u( i) ,  R(i)=R'(i)>0  (5.17a) 

(u(i) ,i=l , . .n}  i=l 

subject  to 
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x( i+1 )=Fx( i)  +  Gu(i) 
x(0)  given,  and  x(n)=0 


(5,17b) 


Obviously,  if  the  modes  are  sensitive  to  the  control-components, 
the  system  can  be  driven  to  (n)=0  from  (0)  with  lower  expense  of 
input  energy  than  if  the  modes  are  insensitive  to  control  components. 
This  fact  is  reflected  in  the  construction  of  J.  The  relative 
orientation  between  the  left  eigenspace  of  F  and  the  range  space  of  G 
(or  equivalently  the  elements  of  M)  and  the  elements  of  F  are  adjusted 
automatically  while  minimizing  J.  Note  also  that  the  relative  cost  of 
various  input  components  can  be  reflected  through  the  weighting  matrix 
R,  which  possibly  may  be  time  varying. 

The  minimization  in  (5.17a)  can  be  carried  out  using  the 
ordinary-least-square  technique  or  using  Linear-Quadratic  (LQ)  theory 
from  modern  control,  although  we  shall  be  using  the  latter  to  get  a 
better  perspective  of  the  problem.  The  Hamiltonian  sequence  H(i) 

H(i)=uT (i)R(i)u(i)  +  pf (i+1) [Fx(i)  +  Gu(i)]  (5.18) 

where  p(i)  is  the  sequence  of  Lagrange  multiplier. 

The  necessary  condition  of  optimality  gives  [3], 

x( i+1 )  =  Fx(i)  +  Gu(i)  (5.19a) 

p(i)  =  Ff  p( i+1 ) ,  i=0 > 1 , . . • n- 1 

subject  to  a  given  x(0)  and  x(n)=0>  and  the  optimal  control  sequence 
is  given  by 

u(i)  -  -R-i(i)G’p(i+l)  (5.19b) 

Solving  in  terms  of  p(0)(note  that  F  is  non-singular  in  our  case) 
and  matching  the  boundary  values  of  x(i)  at  i=0  and  n,  we  get, 
successively , 

p(i)  =  (F’)~ip(0) 

u(i)  =  -R“1(i)GT (Fl)“i~ip(0) 
p(0)  =  Wl(0,n)x(0) 
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W(0,n)  = 


(5.20) 


n-1 

E  F“i-^GR- ^(i)G'(F' )-i_1 
i=0 

Here  W1 (0 ,n)=W(0 ,n)  is  the  usual  controllability  Grammian  except  that 
it  is  weighted  by  a  sequence  R(i),  and  consequently  W(0,n)  in  (5.20) 
may  be  called  "Input-Weighted-Controllability  Grammian."  The  optimal 
control  sequence  is 

u*(i)=-R_1(i)G' (F)-1-1W-1(0,n)x(0) ,  i=0,...n-l  (5.21) 


and  the  optimal  cost  J*  is 


n-1 

J*  =  E  u*' (i)R(i)u*(i)  =  x'(0)W-l(0,n)x(0) 
i=0 


E  (1/oi)  ci 
i=l 

(5.22) 


where  a^=i-th  eigenvalue  of  W(0,n) 

ci  =  <x(0) >ui^  =  projection  of  x(0)  on  i-th  orthonorraal 
eigenvector  of  W(0,n) 

Remarks : 

1.  W! (0 ,n)=W(0 ,n)  and  is  positive  definite  if  the  system  is 
controllable.  If  the  system  is  not  controllable  01=0  for  at  least  one 
i  and  therefore  infinite  energy  is  required  to  bring  the  initial  state 
x(0)  to  zero,  which  makes  sense  physically. 

2.  WT (0 ,n)=W(0 ,n)  >  0  which  implies  the  aj/s  are  also  the 
singular  values  of  W(0,n). 

3.  Since  the  matrices  F,  G  depend  on  T  (the  sampling  time),  the 
Oj/s  and  consequently  the  minimum  J  are  dependent  on  T.  As  we  have 
seen  from  theorem  2.1,  as  T  increases  from  zero  to  infinity,  the 
discretized  SISO  system  loses  controllability  around  T^tt/u^ ,  making 
some  Oj_  equal  to  zero  and  hence  J*  in  (22)  goes  unbounded.  For  other 
values  of  T,  the  o-j/s  are  non-zero  and  finite  and  J*  is  also  finite. 

4.  The  use  of  the  matrix  R(i)  weights  the  share  of  various 
control  components  in  the  minimum  energy.  The  cost  of  various  control 
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components  can  be  reflected  through  R(i).  For  the  single-input  case, 
the  use  of  R(i)  is  superfluous  and  can  be  set  equal  to  1, 

5.  Note  that  an  equivalent  controllability  Grammian  ft(0,n)  can 
be  formed  from  the  controllability  matrix  ^  weighted  by  the  sequence 
R(i)  as  follows: 

ft(0,n)  =  diag  (R(i)Ytf' 
n 

«  l  Fn-iG  R~ 1 ( i )  G'(F')n_i  (5.23) 

i=l 


Although  rank  [ t?(0  ,n)  ]  =  rank  [W(0,n)],  the  singular  values  are 
different.  For  this  modified  controllability  Grammian  tJ(0,n),  matrix 
inversion  of  F  is  not  needed. 


We  are  now  in  a  position  to  find  an  optimal  T  on  a  rational  basis 
The  maximum  possible  normalized  energy  is 

*  J*  x,(0)W“l(0,n)x(0) 

— ,  7  tt — 7—7“  -  max  — -  - 

x(0)eRn  X  (0)X(0)  x(0) 


JN  =  max 


x'CO)x(O) 


-  ||vrl(0,n)||2  -  aOrl(O.n))  ■ 


(5.24) 


where  ||.  |  1 2  denotes  the  induced  Euclidean  norm  and  <*(•)>£(•)>  is  the 
maximum  and  minimum  singular  value  respectively.  From  (5.22)  it  is 
obvious  that  J*  is  bounded  above  and  below  as 

Tb  Mx(0)|  I2  <  J*  <oTlr  M*(0)||2  (5.25a) 


0  < 


_J _ 

o(W) 


<  JN  < 


Jl _ 

a(W) 


(5.25b) 


where  W(0,n)  has  been  denoted  by  W  for  the  sake  of  brevity. 

* 

A  rational  choice  of  T  is  to  minimize  the  upperbound  of  as 

much  as  possible,  i.e.,  the  optimal  T=T*  should  be  such  that 
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(5.26a) 


T*  =  Inf  JN 
T 

★ 

Since  JN  is  bounded  below  by  zero  and  we  shall  be  working  with  a 
compact  interval  l=[0,t],  (5.26a)  is  equivalent  to 


T*  =  min  J^j  -  rain 
Tel  Tel 


o(  W) 


(5.26b) 


Therefore  the  complete  procedure  of  obtaining  T*  is 


n-1 

T*  =  min  max  min  E  uf(i)  R(i)u(i)  (5.27) 

Tel  x(0)eRn  u(i)eRm  i=0 

l>(0)  I  1  =  1 

subject  to 

x( i+1)  »  Fx(i)  +  Gu(i),  x(0)=x(0) 

T 

F  =  exp(AT),  G  =  (  /  exp(As)ds)B 

0 

Note  that  when  the  system  loses  controllability,  then  for  some  Tel, 
o(W)-*0,  or,  l/c[(W)  blows  up.  So  for  computational  and  plotting 
purposes  we  may  as  well  evaluate  (5.26b)  as 

T*  =  max  cr(W)  (5.28) 

Tel 

It  is  conjectured  by  many  practitioners  that  T  should  be  chosen 
to  maximize  the  determinant  of  where  is  the  controllability 
matrix  in  (5.7a)  without  any  rational  justif ication.  We  explain  here 
why  this  determinant  of  is  not  a  good  measure  of  the  quantitative 

controllabiity  ideas  developed  herein.  Recall  from  remark  (5.5)  above 
that  if  R(i)=Im  for  all  i, 

W  =  &(0,n) 

and 

W(0,n)  =  F~n  W(0,n)(F' )-n,  F  =  exp(AT) 
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Therefore 


det  (CCT )  =  det(ft(0 ,n) ) 


a 


n  ot( W(0,n)) 
i=l 


(5.29) 


a 


II  exp(-2Re[  XjJnT) 
i=l 


under  the  assumption  that  A  is  diagonalizable  with  eigenvalues  X-[. 
Expression  (5.29)  clearly  shows  the  inadequacy  of  the  determinant 
criteria,  because  for  T,  where  the  system  almost  loses 
controllability,  the  denominator  of  (5.29)  is  fixed  and  a(W(0,n))  is 
nearly  zero.  According  to  criteria  developed  herein,  the  system  is 
nearly  uncontrollable.  Yet  det (<&&?)  may  be  large  if  the  remaining 
singular  values  are  large;  thus  the  "almost  uncontrollability” 
situation  of  the  discretized  system  remains  undetected  with  the 
determinant  criteria. 

Examples : 

Example  5.1.  Consider  a  SISO  continuous  system 


0 


ij 


The  poles  are  at  -2±j3,  with  a  Nyquist  sampling  rate  T^yq^ 1*04719 
sec.  The  o(W(0,2))  as  a  function  of  T  is  plotted  in  figure  5.1,  which 
rightly  shows  that  at  T=1.047  sec,  the  system  loses  controllability. 

To  avoid  aliasing  effects  we  must  choose  T  smaller  than  the  Nyquist 
sampling  rate,  and  as  seen  from  the  plot  the  optimum  T-0.65  second. 

Note  also  that  near  T=T^yq>  the  degree  of  controllability  is  poor. 

Example  5.2.  As  another  example  consider  the  decoupled  longitudinal 
dynamics  of  a  missile  in  flight  condition  1: 
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x(t) 


(-1.4868 
-149.93 


x  - 


X1 

x2 


uoo\  /  0  \ 

xft)  +  u(t) 

[)  /  \-  28W  n  / 


where  x^(t)  =  angle  of  attack  in  rad 

x^Ct)  =  perturbed  pitch  rate  rad/sec. 
u(t)  =  elevator  angle 

The  poles  are  at  -0 .7434±j 1 12 . 22  with  a  damping  ratio  5=0.061  and 
a  Nyquist  sampling  interval  rate  T^yq=0 .257sec.  _a(W(0,2))  plot  is 

given  in  figure  2  which  shows  that  the  system  loses  controllability  at 
T  i=kT  q  >  k— 1,2.... 


Although  the  optimal  T*  is  lower  than  Tjsjyq  by  an  infinitesimal 
amount,  it  is  recommended  that  a  sampling  time  between  0.1  and  0.2 
sec.  be  chosen  from  practical  considerations. 
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Figure  5.1.  Sampling  Time  Interval  and  Degree  of  Controllability 


5-15 


o 

o 


Figure  5.2.  Sampling  Time  Interval  and  Degree  of  Controllability  for 
an  Air-to-Air  Missile 


5.6  Sampling  Time  Interval  and  the  Observability  of  the  Discretized 

Systems 

In  this  section  we  formulate  an  optimization  problem  for  finding 
an  optimal  sampling  time  interval  T*  from  the  observability  viewpoint. 
The  approach  is  analogous  to  that  in  the  preceding  section.  The  cost 
functional  chosen  for  optimization  is  subjective  and  depends  upon  the 
application  of  the  discretized  system;  but  the  point  we  want  to 
emphasize  is  that  this  type  of  formulation  yields  an  optimal  unique  T. 
It  is  shown  here  how  to  formulate  the  problem  from  the  consideration 
of  sensor  sensitivity  and  optimal  use  of  sensor  measurements. 


The  observability  of  the  SD  system 
x(k+l)  =  Fx(k) ,  x(0)  unknown 

y(k)  -  Hx(k) 


(5.30) 
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is  concerned  with  the  inference  of  the  initial  state  x(0)  from 
n-observations ,  y(k) ,  k=0,-..n  -  1  and  depends  upon  the  observability 
matrix  0  in  (7b), 

Define 

Yn  =  [ y ’ ( 0)  :  y'(l):  •  •  •  y'(n-l)]' 

Then  the  estimate  of  x(0)  based  on  n-observat ions  is 
x(0)  |Yn  =  0#Yn 

where  0^  is  the  generalized  inverse  of  the  observability  matrix  0.  If 
rank  (0)=n,  Yn  lies  in  the  range-space  of  0  and  x(0)  can  be  estimated 
exactly  and 

x(0)  |Yn  =  (0f0)"10,Y n 

When  the  system  is  unobservable,  0*0  is  rank  deficient  and  the 

estimate  is  not  perfect.  The  structure  of  the  observability  matrix  0 

determines  the  ’’observability’'  of  the  system  and  the  system  continues 

to  remain  observable  as  long  as  rank  (0)=n.  To  embed  the 

observability  problem  in  a  quantitative  framework,  note  that  the 

structure  of  this  matrix  also  determines  how  a  given  initial  condition 

x(0)  (or  equivalently  any  given  state  x(k))  is  distributed  in  the 

output  sequence  (y(k) ,  k=0,...n-l}.  Maximizing  observability  by 

adjusting  T  implies  in  the  sense  of  the  that  any  initial 

2 

condition  x(0)  with  energy  | |x(0) | |  gives  rise  to  maximum  energy  in 
the  output  sequence. 

In  the  extreme  case  when  the  system  is  completely  unobservable, 
the  energy  in  the  sequence  {y(k) ,  k=0,...n-l}  is  zero  for  any  x(0). 
There  is  another  advantage  of  maximizing  output  energy.  For  a  good 
performance  from  the  sensors  it  is  desirable  to  maximize  the  energy, 
because  for  a  given  x(0)  (or  (x(k)})  and  unmeasurable  corrupting 
output  noise,  this  is  equivalent  to  maximizing  signal  to  noise  power 
ratio  and  consequently  best  sensor  performance  is  obtained.  There  is 
another  motivation  that  some  sensors  may  be  more  efficient  than  others 
and  less  efficient  sensors  will  need  higher  signal  to  noise  ratio  than 
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the  more  efficient  ones.  These  observations  suggest  a  weighted 
cost-functional  (weighted  energy  in  the  output  sequence),  similar  to 
(5.17a), 


n-1 

J  =  Z  y'(i)R(i)y(i),  R(i)  =  R'(i)  >  0  (5.32) 

i=0 

where  R(i)  determines  the  relative  importance  of  various  sensors.  We 
should  then  maximize  J.  However  (5.32)  reduces  to 

J  =  x' (0)V(0,n)x(0)  (5.33a) 


where 


n-1 

V(0,n)  =  Z  (F’)iC'R(i)C(F)i  (5.33b) 

i=0 

may  be  called  the  "output-weighted  observability  Grammian." 

The  normalized  energy  is 

T  =  J 

N  x' (0)x(0) 

and  the  minimum  possible  normalized  energy  is 
*  J 

JN  =  min  J  =  min  wnw'ol  =  £(V(0,n)  (5.34) 

x(0)eRn  x(0)  eRn  X  ('U;x('U; 

where  _c  (  . )  ,  o(.)  denote  as  usual  the  minimum  and  maximum  singular 
value  respectively.  Note  that  is  bounded  below  and  above  as 

0  <  _a(V(0,n)  <  JN  <  a(V(0,n) 

and  when  the  system  is  unobservable  ja(V(0 ,n) )=0.  The  minimum  singular 
value  of  V(0,n),  _a(V(0,n))  is  a  sensitive  measure  of  unobservability, 
because  the  system  need  not  be  completely  unobservable  for  a((V)(0,n)) 
to  be  zero.  If  any  subspace  of  Rn  is  unobservable  an  arbitrary  x(0) 
will  have  non-zero  projection  on  this  sub-space  and  o( V(0 ,n) )-0t  We 
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therefore  should  choose  T  to  maximize  JN  to  take  the  system  away  from 
unobservability  as  much  as  possible.  The  optimal  T=T*  should  then  be 
chosen  such  that 


T*  =  sup  JN 
T 


with  the  constraint  that  T*  should  be  less  than  the  Nyquist  sampling 
rate. 

Therefore,  following  the  arguments  of  the  previous  section,  we 
should  find  an  optimal  sampling  time  T*  from  the  observability 
viewpoint  by  solving  the  following  max-min  problem: 


n-1 

x1  (0)  (  E  (Ff )iGf R(i)C(F)i)x(0) 
i=0 


T*  =  max  min 

•p  e*  D 


Examples 


Example  5.3.  We  consider  again  the  example  1  of  the  previous  sec¬ 
tion  with 
the  output  matrix 

H  =  (1  0) 

The  minimum  singular  value  plot  of  V(0,2)  as  a  function  of  T  is  given 
in  figure  5.3.  Note  the  similarity  with  figure  1  and  observe  that  the 
sampling  time  at  which  the  system  loses  controllability  is  also  the 
time  at  which  the  system  loses  observability.  These  happen  at  the 
Nyquist  sampling  interval  of  1.04719  seconds. 
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SAMPLING  TIME  IN  SECS 

Figure  5.3:  Sampling  Time  Interval  and  Degree  of  Observability 

Example  5.4.  Consider  the  example  5.2  with  angle  of  attack  as  the 
output,  i.e. 

H  =  (I  0) 

jj(V(0,2))  plot  is  given  in  figure  4.  Notice  again  the  similarity  with 
Figure  2.  At  T=0.257  sec.  the  observability  is  lost. 
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SAMPLING  TIME  IN  SECS 


Figure  5.4:  Sampling  Time  Interval  and  Degress  of  Observability  of  the 
Air-to-Air  Missile 


5.7  Conclusions 


In  this  paper  we  have  described  a  framework  for  determining  a 
unique  optimal  sampling  time  T.  The  solution  T  is  given  by  a  mini-max 
problem  when  considered  from  the  controllability  viewpoint,  and  by 
maxi-min  problem  when  considered  from  an  observability  viewpoint.  The 
choice  of  cost-functionals  as  a  basis  of  an  optimization  problem  is 
very  much  a  subjective  matter  and  depends  upon  the  application  of  the 
discretized  system.  But  nevertheless,  the  framework  developed  in  the 
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paper  is  based  on  practical  considerations;  the  analysis  is  very 
simple,  and  the  results  are  extremely  useful  to  practicing  control 
engineers. 
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CHAPTER  6 


SIMULATION  RESULTS 


6 . 1  Introduction 

The  identification  technique  using  CVA  (Canonical  Variate 
Analysis)  has  been  described  in  Chapter  2  and  the  robustness  analysis 
of  the  simplified  MAC  controller  has  been  analyzed  in  Chapters  3  and 
Chapter  4.  These  results  are  combined  in  this  chapter  as  an  Adaptive 
MAC  (AMAC)  controller,  and  its  performance  will  be  demonstrated 
through  realistic  simulations  in  deterministic  as  well  as  in 
stochastic  environments.  The  simulation  runs  have  been  designed  to 
emphasize  the  effect  of  data  length,  dither  strength  (SNR),  and  closed 
loop  identification  capability  of  the  CVA  technique.  It  has  also  been 
shown  how  AMAC  behaves  for  SISO  and  MIMO  plants. 

The  primary  purpose  of  this  chapter  is  to  exhibit  the  strength  of 
the  CVA  technique  as  a  closed-loop  identifier  and  to  demonstrate  the 
reliable  adaptive  control  scheme  AMAC  which  utilizes  the  robust  MAC 
technique.  If  the  performance  of  the  CVA  technique  degrades  for  some 
reason  i.e.  the  identified  plant  is  not  'close'  to  the  actual  plant, 
the  robustness  of  MAC  compensates  for  it  in  the  sense  that  it  enables 
the  plant  to  maintain  the  closed-loop  stability  and  follow  the  desired 
trajectory. 

This  chapter  is  orgnized  as  follows:  The  simulation  models  have 
been  selected  from  the  previous  project  report  on  MAC 
( AFWAL-TR-80-3 125) .  For  the  sake  of  completeness  of  this  report,  the 
models  and  the  various  simulation  parameters  are  described  again  in 
Section  6.2.  Simulation  results  under  various  scenarios  are  presented 
in  Section  6.3.  Finally  the  summary  and  conclusions  are  given  in 
Section  6.4. 


6-1 


6.2  Simulation  Model  and  Simulation  Parameters 


The  simulation  models  have  been  selected  from  the  previous  report 
on  MAC  [AFWAL-TR-80-3125] .  The  SISO  and  MIMO  models  are  extracted 
from  a  single  hypothetical  air— to— air  missile  model  with  asymmetric 
aerodynamic  properties.  This  model  represents  a  simple,  three-axis 
attitude  control  problem  in  flight  condition  1  with  independent  pitch 
axis  and  coupled  roll-yaw  dynamics.  In  this  flight  condition  (Mach  2 
at  20,000  ft.  and  weighing  239.5  lb),  this  missile  is  flying  at  an 
equilibrium  pitch  angle  of  9°,  sideslip  of  0°  and  roll  angle  of  0°. 

6.2.1  SISO  Model 

The  SISO  Model  consists  of  the  decoupled  pitch  axis  dynamics  with 
2  states.  The  model  in  the  continuous  time  domain  is 


x(t) 


/  it(o 
\ 


(-1.4868  1 

-149.93  0 


x(t) 


(6.1a) 


y(t)  =  (1  0)x(t) 


(6.1b) 


The  states  are: 

x^(t)  =  angle  of  attack, 

x2(t)  =  perturbed  pitch  rate  (rad/sec), 

with  input  u(t)  =  elevator  angle  (rad)  and  output  y(t)  =  angle  of 
attack  (rad).  The  open  loop  poles  are  at  — 0.7434ij 12.222  with  a 
damping  ratio  of  0.061  which  shows  that  the  pitch  axis  dynamics  are 
quite  oscillatory. 

The  plant  dynamics  are  discretized  at  a  sampling  rate  of  10  Hz 
using  the  exponential  transform  (sample  and  zero  order  hold).  The 
resulting  poles  of  the  discrete  time  system  are 

0.31711±j0. 87252  (6.2) 
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with  a  modulus  of  0.92836.  The  pulse  response  and  step  response  when 
these  are  applied  at  t=0.4  seconds  to  this  system  are  shown  in  Figure 
6.1.  The  true  poles  in  equation  (6.2)  will  be  subsequently  compared 
with  those  of  the  identified  systems. 

6.2.2  MIMO  Model 

The  coupled  roll-yaw  dynamics  from  the  same  air-to-air  missile  in 
section  6.2.1  are  used  for  the  MIMO  Model.  It  has  four  states,  two 
inputs  and  two  outputs.  The  states  are 

x^(t)  =  sideship  angle  (rad) 

X2(t)  =  perturbed  roll  rate  (rad/sec) 

X3(t)  =  perturbed  yaw  rate  (rad/sec) 
x^(t)  =  roll  angle  (rad) 

with  inputs 

u^(t)  =  aileron  angle  (rad) 

U2(t)  =  rudder  angle  (rad) 

and  outputs 

y^(t)  =  sideship  angle  (rad) 
y2(t)  =  roll  angle  (rad). 

An  early  analysis  of  these  dynamics  indicated  a  very  severe  roll 
instability.  Since  MAC  can  work  only  for  systems  with  a  finite 
impulse  response,  roll  angle  and  rate  feedback  were  added  to  the 
aileron  command  to  add  damping  to  the  system  (see  the  previous  report, 
page  125).  With  such  compensation,  the  dynamics  are 
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x(t)  = 


(-0.91237  0.13708  -1.0  0.015431  \  /  0 

-1559.2  -4385.3  0  -4385.3  j  / 877Q.fi 

290.48  0  00/  x(t)  +  I  0 

0  1  0  0/  \  0 

(1  0  0  0  \ 

0  0  0  1  I  X(t) 

The  open-loop  poles  are  at 
-4384.24, 

-1.00040,  -0.484±j 17 .035 .  (6.3c) 

As  in  the  SISO  case,  the  plant  dynamics  are  discretized  using  an 
exponential  transform  for  a  sampling  interval  of  0.1  seconds.  The 
open-loop  poles  of  the  discretized  system  are: 

0.00000654,  0.9047,  -0. 12609 ±j0. 9444  (6.4) 

The  response  of  this  system  to  a  pulse  and  a  step  in  aileron  input  is 
shown  in  Figure  6.2.  The  corresponding  responses  to  similar  excita¬ 
tions  in  rudder  input  are  shown  in  Figure  6.3.  As  in  SISO  case,  these 
inputs  are  applied  at  t=0.4  seconds.  In  all  the  figures  involving 
MIM0  plant  simulations,  the  following  notations  have  been  used: 

on  output  plots: 

A  =  sideship  angle, 

B  -  roll  angle, 

on  input  plots: 

A  =  aileron  angle, 

B  =  rudder  angle. 

It  is  obvious  from  Figures  6.2  and  6.3  that  the  first  output  is  insen¬ 
sitive  to  changes  in  the  first  input  and  the  second  output  is  simi¬ 
larly  related  to  the  second  input. 


0 
0 

281.11  /u(t) 
0 

(6.3a) 


(6.3b) 
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6.2.3  Simulation  Parameters 


In  order  to  facilitate  comparison  between  related  plots,  the 
scales  have  been  kept  constant,  if  possible,  within  each  series  of 
runs.  Unless  otherwise  noted,  the  following  conditions  existed  in  the 
s imulat ions : 

•  The  sample  time  was  0.1  seconds. 

•  The  controls  were  computed  for  the  three  blocks  ending  at 
one,  three  and  five  steps  in  the  future  (for  details  of  the 
input  blocking  techniques  see  the  previous  report  on  MAC). 

•  The  reference  trajectory  time  constant  was  0.1  seconds  for 
all  outputs. 

•  No  input  constraints  were  imposed. 

•  It  was  assumed  that  the  plant  model  was  completely  unknown  at 
the  beginning. 


Therefore  the  missile  was  allowed  to  run  open-loop  for  a  while  under 
the  effect  of  dither  excitation  and  measurement  noise.  The  plant  was 
identified  at  the  end  of  this  period  which  was  then  used  by  MAC  as  an 
internal  model  of  the  plant.  The  set  points  were  then  changed  at  the 
end  of  this  interval  as  follows: 

For  the  SISO  plant,  angle  of  attack  was  set  from  0°  to  15°, 

For  the  MIMO  plant,  sideslip  was  set  from  0°  to  10°  and  the  roll 

set  point  remained  at  0°. 

•  The  output  weights  were  all  equal  to  1  and  no  input  weights 
were  used. 

•  The  input  excitation  noise  (dither)  and  measurement  noise 
were  white  Gaussian  noise  processes  generated  by  the 
subroutine  GGNML  from  IMSL  library. 
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6.3  Simulation  Under  Various  Scenarios 


Under  each  condition,  AMAC  was  applied  to  the  SISO  plant  of 
Section  6.2.1  and  the  MIMO  plant  of  Section  6.2.2.  These  results  are 
exhibited  separately. 

6.3.1  MAC  Applied  to  Perfectly  Known  Plants 

Extensive  simulation  results  under  this  condition,  i.e.  when  the 
plant  model  is  perfectly  known,  have  been  reported  in  the  previous 
report  on  MAC  [ AFWAL-TR-80-3 125 ] .  Two  of  these  results  are  reproduced 
here  for  later  comparison  with  AMAC  performances.  The  control  and  the 
output  of  the  SISO  plant  under  the  same  simulation  parameters  of 
Section  6.2.3  when  the  set  point  is  changed  from  0°  to  15°  at  0.4 
seconds  is  shown  in  Figure  6.4.  Similar  response  for  the  MIMO  plant 
for  a  set  point  change  at  7.0  seconds  is  shown  in  Figure  6.5. 

6.3.2  AMAC  Applied  to  Unknown  Plants 

The  adaptive  MAC  was  applied  to  the  plants  of  Sections  6.2.1  and 

6.2.2  and  the  results  are  shown  in  the  subsequent  figures.  The 
variance  of  the  excitation  signal  (dither)  was  0.1  and  that  of  the 
measurement  noise  was  0.05  so  that  the  signal-to-noise  ratio  (SNR)  was 
6db.  This  ratio  is  considered  to  be  realistic  by  many  practicing 
engineers.  The  dither  was  superimposed  on  the  normal  input  obtained 
from  MAC  algorithm  and  the  measurement  noise  was  added  to  the  actual 
output  of  the  plant. 

The  SISO  plant  was  identified  at  the  end  of  every  7-second  inter¬ 
val  and  the  optimal  state  order  was  selected  using  the  AIC  criteria 
(see  Chapter  2  for  details).  As  mentioned  earlier,  the  plant  was 
running  open  loop  during  the  first  interval  and  closed  loop  in  the 
subsequent  intervals.  The  control  and  the  output  sequences  are 
plotted  in  Figure  6.6  -  the  vertical  dotted  lines  in  this  and  the  sub¬ 
sequent  figures  indicate  the  length  of  the  intervals.  The  plant  is 
identified  at  the  instants  indicated  by  these  dotted  lines.  This 
figure  clearly  shows  that  under  AMAC,  the  plant  can  track  the 
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reference  input  albeit  at  the  expense  of  ride  comfort  (or  oscillations 
in  the  output).  To  see  how  CVA  performs  when  combined  with  MAC,  we 
have  compared  the  transfer  function  of  the  identified  plant  in  the 
first  interval  (i.e.  open-loop  identification)  with  the  actual  one  in 
Figure  6.7(a)  and  that  from  the  3rd  interval  (closed-loop  iden¬ 
tification)  in  Figure  6.7(b).  The  optimal  state  order  and  the  iden- 


tified  poles  during  various 

intervals  (see  Figure  6.6)  are  found  as 

follows : 

State  Order 

Poles 

Section  I  3 

0.588,  0.3437±j0.8509 

Section  II  3 

0.988,  0 .3042±j0 .8455 

Section  III  3 

0.966,  0 .2664±j0 .8308 

These  poles  of  the  identified  system  can  be  compared  with  those  of 
actual  plant  which  are  at  0 . 317 1  l±jO .87252. 

The  MIMO  plant  was  identified  every  20  seconds  under  similar  con¬ 
ditions,  the  plant  being  run  open  loop  in  the  first  interval.  The 
servo  performance  of  AMAC  under  this  run  is  shown  in  Figure  6.8.  The 
set  point  was  changed  at  the  20th  second.  The  optimal  order  and  the 


identified  poles  are: 

State  Order 

Poles 

Section  I  3 

0. 8089, -0.12066±j0. 9312 

Section  II  5 

0.5888,  0.7479,  0.835, 

-0.1358±j0.9308 

Again  these  identified  poles  may  be  compared  with  the  actual  ones  in 
equation  (6.4).  Each  element  of  the  identified  transfer  function  from 
Section  I  (i.e.,  open-loop  identification)  is  compared  with  the 
corresponding  element  of  the  actual  transfer  function  in  Figure  6.9. 
The  comparison  of  the  closed-loop  identified  system  (i.e.  from  Section 
II)  is  made  in  Figure  6*10.  Note  that  the  accuracy  of  the  transfer 
function  identification  is  essentially  the  same  for  both  the  open-loop 
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and  closed-loop  identification  which  is  a  theoretical  property  of  the 
CVA  identification  method  as  discussed  in  Chapter  2. 

6.3.3  Effect  of  Data  Length  and  Dither  Strength 

The  adaptation  interval  for  the  SISO  plant  was  reduced  from  7 
seconds  to  4  seconds  and  the  AMAC  was  applied  to  the  plant,  keeping 
other  simulation  parameters  unchanged.  But  this  time  the  identified 
plant  was  too  far  away  from  the  true  plant  and  the  inherent  robustness 
of  MAC  was  not  adequate  to  enable  the  plant  to  track  the  reference 
input.  The  closed-loop  was  unstable  as  is  shown  in  Figure  6.11.  The 
dither  strength  was  then  raised  to  1.0  thus  making  SNR  26  db.  The 
adaptation  interval  was  fixed  at  4  seconds.  This  time  the  quality  of 
the  identified  plant  was  better  and  the  plant  under  MAC  was  able  to 
track  the  reference  input  again  albeit  at  a  cost  of  much  higher 
oscillation.  The  resulting  tracking  behavior  is  shown  in  Figure  6.12. 
The  identified  plant  in  the  open-loop  and  closed-loop  environments  are 
compared  in  Figure  6.13.  The  optimal  state  orders  for  Sections  I,  II 
and  III  were  respectively  4,  3  and  6. 

For  the  MIMO  plant  the  data  length  was  reduced  from  200  to  100 
and  similar  effect  was  observed  -  the  closed  loop  was  unstable  as 
shown  in  Figure  6.14.  As  in  the  SISO  case  above  SNR  was  raised  to  26 
db  by  increasing  the  dither  strength  to  1.0.  As  shown  in  Figure  6.15, 
the  tracking  capability  of  AMAC  was  revived  again.  The  identified 
system  from  the  closed  loop  operation  is  compared  in  Figure  6.16.  The 
optimal  state  order  was  6  in  both  sections  I  and  II. 

The  simulations  in  this  section  clearly  indicate  that  the  servo 
quality  of  AMAC  can  be  improved  either  by  increasing  data  length  or 
dither  strength. 
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6.3.4  No  Measurement  Noise 


In  this  set  of  runs.  It  was  assumed  that  there  was  no  measurement 
noise  and  the  dimension  of  the  state-space  was  known  apriori.  The 
intensity  of  the  dither  signal  was  taken  to  be  0.1. 

The  SISO  plant  is  identified  every  2.5  seconds,  i.e.  only  25  data 
points  were  used  in  the  identification  algorithm.  The  result  of 
applying  AMAC  is  shown  in  Figure  6.17  and  the  transfer  function  of  the 
identified  plant  is  compared  in  Figure  6.18.  The  identified  poles  are 
follows : 


State  Order 
Section  I  2 

Section  II  2 

Section  III  2 


Poles 

0. 29 81 ±j 0.8751 
0.3161  ±j0 .8629 
0.3094±j0.8729 


Under  similar  conditions,  AMAC  was  applied  to  MIMO  plant  for  a 
data  length  of  50,  i.e.  the  identification  scheme  was  invoked  every  5 
seconds.  The  result  is  shown  in  Figure  6.19.  The  transfer  function 
of  the  identified  plant  in  closed  loop  operation  (i.e.  ,  from  segment 
III)  is  compared  in  Figure  6.20.  The  identified  poles  from  different 
segments  of  the  run  are  as  follows: 


Section  I 
Section  II 
Section  III 


State  Order 
4 
4 
4 


Poles 

0.907,-0.029, 
0.676,  0.912, 
0.888±j0.037, 


-0. 145 ±j 0.885 
-0. 1 15 ±j0  .967 
-0.1  l±j0.948 


These  plots  show  that  when  there  is  no  observation  noise,  the  CVA 
technique  can  reliably  identify  the  plant  from  a  relatively  small  data 
length. 
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6.3-5  Gust  Noise  Excitation 


To  demonstrate  the  effect  of  colored  noise  excitation  on  the 
accuracy  of  the  identified  trasnfer  function,  a  wind  gust  excitation 
of  the  form  described  in  MIL-F-8785  (Hoh  et  al,  1982)  is  used.  This 
is  in  contrast  to  the  white  noise  input  excitation  used  in  the  other 
simulations  of  this  chapter.  The  wind  gust  excitation  was  simulated 
using  a  white  noise  excitation  of  unit  variance  into  a  transfer 
function  shown  in  Figure  6.21  along  with  the  plant  transfer  function. 
The  gust  excitation  level  was  chosen  so  that  the  total  variance  of  the 
input  excitation  was  the  same  as  the  white  noise  excitation  used  in 
Figure  6.6  and  6.7. 

The  control  and  output  sequences  are  shown  in  Figure  6.22.  The 
identified  transfer  functions  corresponding  to  the  time  intervals  I 
and  III  are  shown  in  Figures  6.23  with  the  use  of  open  and  closed  loop 
data  respectively.  In  theory,  the  accuracy  of  the  identified  transfer 
function  at  different  frequencies  is  proportional  to  the  ratio  of  the 
input  excitation  power  to  the  measurement  noise  power  at  the 
frequency.  Thus  one  would  expect  to  see  a  slightly  greater  accuracy 
of  the  transfer  function  near  the  peak  of  the  gust  spectrum  and 
slightly  lower  accuracy  at  the  frequencies  with  low  power  when 
compared  with  Figure  6.7.  This  is  consistent  with  the  simulation  run, 
however  the  statistical  variability  is  high  in  comparing 
identification  accuracy  on  only  two  data  sets. 

An  implicit  input  excitation  where  the  excitation  is  not  observed 
was  also  considered.  The  result  is  of  little  use  in  transfer  function 
identification  since  only  the  magnitude  of  the  trasnfer  function  is 
obtainable  and  not  the  phase.  In  addition  the  accuracy  of  the 
magnitude  function  is  considerably  worse  than  in  the  case  of  an 
explicit  input  excitation.  Thus  the  presence  of  wind  gusts  are  of 
very  limited  value  in  plant  transfer  identification  unless  the  gust 
excitations  are  accurately  measured. 
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6.4  Conclusion 


The  simulations  of  this  chapter  have  demonstrated  the  fact  that 
the  combination  of  CVA  and  MAC  results  in  a  reliable  adaptive  control 
scheme.  This  scheme  can  be  used  in  an  environment  where  the  plant 
model  is  completely  unknown  and/or  slowly  time  varying.  The  satisfac¬ 
tory  performance  of  AMAC  demonstrated  that: 

(i)  CVA  can  identify  a  plant  satisfactorily  in  an  open  loop  as  well 
as  in  closed  loop  operation  of  the  plant. 

(ii)  The  optimal  state-order  selection  criteria  (using  AIC)  is 
extremely  helpful  when  the  state-space  dimension  of  the  true  plant  is 
not  known  apriori.  The  comparison  between  the  identified  and  the  true 
transfer  function  shows  that  this  order  selection  technique  works  very 
well  in  a  low  SNR  environment. 

(iii)  The  accuracy  of  the  identified  plant  (and  hence  the  performance 
of  AMAC)  depends  upon  data  length  and  SNR.  However  these  factors  can 
be  traded  between  one  another  -  CVA  performance  can  be  maintained  by 
using  shorter  data  length  and  larger  SNR  and  vice  versa. 

(iv)  MAC  has  excellent  robustness  properties.  As  a  result  the  closed 
loop  performances  can  be  maintained  in  many  instances,  even  when  the 
quality  of  identification  has  been  degraded. 

(v)  If  there  is  no  measurement  noise,  the  plant  can  be  identified 
from  a  much  smaller  sample  size  compared  to  the  situations  having 
measurement  noise. 

It  is  worth  noting  that  the  MAC  control  technique  is  based  upon 
the  impulse  response  model  of  the  plant  and  therefore  MAC  can  be  used 
only  for  controlling  stable  plants.  This  causes  no  problem  in  a 
deterministic  environment  if  the  plant  is  a  stable  one.  But  in  an 
adaptive  control  scheme  where  the  plant  is  reidentified  frequently, 
the  identified  plant  may  turn  out  to  be  unstable  if  the  data  length  is 
too  short  or  the  signal-to-noise  ratio  too  low  even  if  the  true  plant 
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is  an  asymptotically  stable  one.  We  indeed  faced  this  problem  in  some 
of  the  simulations  of  this  chapter,  but  the  effect  was  not  dramati¬ 
cally  visible  because  the  intervals  of  simulations  were  too  short. 
However  this  problem  can  be  remedied  by  using  Model  Predictive  Control 
(MPC)  technique  -  a  newer  version  of  MAC  which  can  handle  stable  and 
unstable  systems  with  equal  ease  in  the  same  framework. 
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Figure  6.1  Pitch  axis  dynamics  for  simulation 
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Figure  6.2  Roll-yaw  responses  due  to  aileron  movement 
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Figure  6.3  Roll-yaw  responses  due  to  rudder  movement 
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Figure  6.4  MAC  applied  to  perfectly  known  SISO  plant 
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Figure  6.5  MAC  applied  to  perfectly  known  MIMO  plant 
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Figure  6.6  AMAC  applied  to  SISO  plant 
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(a)  Open  loop  identification  form  segment  I 


(b)  Closed  loop  identification  from  segment  III 

Figure  6.7  Actual  Plant  vs  Identified  Plant 
from  the  run  of  Figure  6.6 

-  true  plant 

-  Identified  plant 
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Figure  6.8  AMAC  applied  to  the  MIMO  plant 
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(a)  (1,1)  element 


(b)  (1,2)  element 

Figure  6.9  Open  loop  identification  of  the  MIMO 
plants  (Section  I,  Figure  6.8) 
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(c)  (2,1)  element 


(d)  (2,2)  element 

Figure  6.9  (Continued) 

-  true  plant 

-  identified  plant 
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(a)  (1,1)  element 


(b)  (1,2)  element 

Figure  6.10  Closed  loop  identification  of  the  MIMO 
plants  (Section  II,  Figure  6.8) 
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Figure  6.10  (Continued) 

-  Actual  plant 

-  identified  plant 
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Figure  6.12  Effect  of  higher  SNR  on  AMAC 
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(a)  Open-loop  identification,  (Section  I,  Figure  6.12) 


(b)  Closed-loop  identification,  (Section  III,  Figure  6.12) 

Figure  6.13  Comparison  of  Identified  Plants 
(See  Figure  6.12) 
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Figure  6.15  Effect  of  increasing  dither  strength 
on  MIMO  AMAC 
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(a)  (1,1)  element 


(b)  (1,2)  element 

Figure  6.16  Identified  plant  by  increasing  dither 
strength  (Section  II,  Figure  6.15) 
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Figure  6.16  (Continued) 

-  Actual  plant 

-  identified  plant 
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Figure  6.17  AMAC  with  no  observation  noise  and 
short  data  length 
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(a)  Open-loop  identification,  (Section  I,  Figure  6.17) 


(b)  Closed-loop  identification,  (Section  III,  Figure  6.17) 

Figure  6.18  SISO  plant  identified  from  25 
data  points 
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Figure  6.19  AMAC  applied  to  MIMO  plant  (no  measurement 
noise)  and  shorter  data  length 
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(a)  (1,1)  element 


(b)  (1,2)  element 

Figure  6.20  Closed  loop  identification  of  MIMO  plant, 
50  data  points  and  no  measurement  noise 

(Section  III,  Figure  6.19) 
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(c)  (2,1)  element 


(d)  (2,2)  element 

Figure  6.20  (Continued) 

-  Actual  plant 

-  identified  plant 


6-36 
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Figure  6.21  Transfer  Function  of  Wind  Gust  Model  ( - )  and  plant  model  ( - ). 
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Figure  6.22  AMAC  Applied  With  Gust  Input  Excitation. 
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(a)  Open  Loop  Identification  from  Segment  I. 

Figure  6.23  Actual  Plant  vs.  Identified  Plant  for  Run  of  Figure  6.22. 
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(b)  Closed  Loop  Identification  from  Segment  III, 


Figure  6.23  (Continued) 
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CHAPTER  7 


CONCLUSIONS 


The  overall  conclusion  of  this  study  is  that  MAC  control  design 
technique  can  be  used  in  situations  where  the  plant  model  is  not  know 
exactly  and/or  slowly  time  varying  by  incorporating  a  suitable  on-line 
parameter  estimation  technique  in  the  existing  MAC  software.  Many  of 
the  available  techniques  for  system  identification  suffer  from  the 
fact  that  these  can  not  identify  the  system  in  a  closed-loop  con¬ 
figuration.  But  the  one  developed  in  Part  1  of  this  report  is  based 
on  canonical  variate  analysis  and  has  the  same  performances  in  both 
open-loop  and  closed-loop  configurations.  The  robustness  analysis  in 
Part  2  gives  the  neighborhood  of  stability  around  the  identified  model 
provided  that  the  nominal  MAC  loop  is  stable  for  the  identified  plant. 
Thus  combining  the  results  of  Parts  1  and  2,  adaptive  MAC  provides  an 
analytically  sound  and  very  useful  control  design  technique  in  an 
uncertain  environment  such  as  in  the  missile  attitude  control  problem 
in  different  flight  conditions  where  the  plant  model  drifts  from  one 
flight  condition  to  another.  The  problem  of  under  sampling  and  over 
sampling  can  be  avoided  by  using  the  optimum  selection  technique  deve¬ 
loped  in  Part  2 . 

Specific  conclusions  of  this  study  are: 

(i)  MAC  software  uses  impulse  response  description  of  the  plant 
and  therefore  cannot  be  used  if  the  plant  is  unstable  to  start  with. 

On  the  other  hand  if  the  plant  is  lightly  damped,  the  impulse  response 
sequence  contains  a  large  number  of  terms  and  computational  require¬ 
ments  become  large.  For  these  systems,  it  is  recommended  that  the 
plant  be  made  stable  and/or  damping  be  added  to  the  dynamics  of  the 
plant  apriori  by  using  constant  gain  output  feedback  and  then  MAC  be 
applied  to  the  overall  compensated  plant.  However  if  the  overall 
dynamics  are  made  very  fast  using  high  gain,  the  sampling  rate  must  be 
high  too  in  order  to  satisfy  Nyquist's  sampling  criteria.  However  if 
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the  plant  model  is  not  known  exactly,  gain  should  not  be  made 
arbitrarily  high  because  if  the  unmodelled  dynamics  have  non-minimum 
phase  zeros,  the  overall  combination  will  again  be  unstable. 

(ii)  The  standard  frequency  domain  robustness  analysis  can  also 
be  applied  to  a  one-step-ahead  MAC  control  law  and  thus  the  MAC 
robustness  can  be  compared  to  that  of  other  conventional  control 
design  techniques  under  similiar  situations.  The  robustness  results 
obtained  in  this  report  for  SISO  plants  can  be  extended  to  MIMO  plants 
if  the  magnitude  function  is  replaced  by  the  operator  norm  of  the 
transfer  function.  Every  nominally  stable  design  guarantees  the  sta¬ 
bility  of  a  class  of  plants  in  the  neighborhood  of  the  nominal  one, 
and  the  boundary  of  this  neighborhood  has  been  identified  in  Part  2  of 
this  report.  It  is  recommended  that,  before  applying  MAC  to  any  real 
world  situation,  the  region  of  guaranteed  stability  be  calculated  and, 
if  unsatisfactory,  enlarge  by  slowing  down  the  trajectory  time 
constants  and/or  other  parameters. 

(iii)  Any  conventional  on-line  parameter  identification  tech¬ 
nique  can  be  embedded  in  the  existing  MAC  software  to  generate  the 
internal  model  of  the  plant  and  the  resulting  control  technique  in  an 
"Adaptive  MAC".  It  is  recommended  that  the  identification  technique 
based  on  canonical  variate  analysis  developed  in  Part  1  of  this 
report  be  used  for  identifying  and  updating  the  system  parameters.  The 
advantage  of  this  technique  is  that  it  can  identify  the  plant  equally 
well  in  open-loop  and  closed-loop  configuration  and  it  can  give  the 
simultaneous  confidence  band  on  the  transfer  function  for  all  frequen¬ 
cies.  In  this  technique  the  parameters  are  updated  intermi ttant ly 
whereas  in  other  conventional  techniques  this  is  done  in  every  step. 
Although  the  computational  requirement  is  comparatively  higher  in  this 
technique,  the  quality  of  the  estimate  and  computational  reliability 

of  the  solution  justifies  this  additional  burden. 

(iv)  Sampling  interval  is  an  important  parameter  in  the  MAC 
design  process.  Usually  sampling  rate  is  selected  satisfying  the 
constraints  of  Nyquist  rate,  but  yet  the  designer  is  confronted  with  a 
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choice  from  infinitely  many  rates  satisfying  this  constraint.  The 
optimum  (possibly  unique)  sampling  rate  selection  technique  developed 
in  this  report  relieves  the  designer  form  this  problem.  The  use  of 
this  technique  is  not  limited  to  MAC  control  design  only  -  it  can  be 
used  in  any  situation  where  a  sampling  rate  is  to  be  selected. 

(v)  The  simulation  results  in  Chapter  6  demonstrate  that  the  use 
of  MAC  and  a  suitable  system  identification  method  such  as  CVA  or 
Maximum  Likelihood  provide  a  reliable  adaptive  control  method  if  there 
is  sufficient  input  excitation  or  data  length.  The  AMAC  procedure  is 
demonstrated  on  multiinput  raultioutput  systems  in  closed  loop 
operation  under  MAC  feedback  control.  The  accuracy  of  the  parameter 
identification  is  shown  to  be  the  same  in  either  open  or  closed  loop 
operation  as  is  predicted  by  theory.  The  selection  of  state  order 
using  the  AIC  procedure  in  the  CVA  method  is  shown  to  give  accurate 
model  selection  in  the  cases  where  state  order  is  unknown.  The 
accuracy  of  the  identified  plant  can  be  increased  by  increasing  the 
data  length  or  the  input  excitation  amplitude.  The  robustness  of  MAC 
can  accomodate  a  moderate  uncertainty  in  the  identified  plant,  but  for 
too  large  an  error  the  closed  loop  system  may  become  unstable. 

(vi)  The  results  of  this  study  suggest  a  number  of  fruitful  areas 
for  future  research.  The  MAC  approach  uses  the  impulse  response 
representation  of  the  plant  dynamics  which  has  the  difficulty  of  being 
unbounded  for  unstable  systems  and  very  long  for  very  lightly  damped 
systems.  Constant  gain  feedback  is  used  in  this  study  to  obtain  a 
closed  loop  system  that  is  well  damped.  A  more  direct  approach  is 
that  of  Model  Predictive  Control  (MPC)  using  a  state  space 
representation  of  the  system.  Such  a  representation  is  in  fact  the 
natural  representation  given  in  the  CVA  identification.  The  CVA 
procedure  can  be  easily  extended  to  nonlinear  systems  of  polynomial 
form.  This  would  greatly  widen  the  areas  of  application  of  the  AMAC. 
Another  area  for  research  is  the  use  of  the  confidence  intervals  on 
the  identified  transfer  function  and  the  robustness  bounds  on  the  MAC 
controller  to  determine  the  required  sample  size  or  input  excitation 
to  maintain  stable  closed  loop  operation. 
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A  UNIFIED  VIEW  OF  REDUCED  RANK  MULTIVARIATE  PREDICTION  USING 
A  GENERALIZED  SINGULAR  VALUE  DECOMPOSITION 


By  Wallace  E.  Larimore 


A  Unified  View  of  Reduced  Rank  Multivariate  Prediction 
Using  a  Generalized  Singular  Value  Decomposition 


By  Wallace  E.  Larimore 

Scientific  Systems  Inc.,  Cambridge,  Massachusetts,  USA. 
SUMMARY 


A  generalized  reduced  rank  prediction  problem,  which  is  a  generalization  of  a 
number  of  multivariate  analyses  including  the  classical  canonical  correlation  analysis,  is 
formulated  as  an  explicit  prediction  problem:  given  two  sets  of  random  variables  and  an 
integer  p,  find  p  linear  combinations  of  the  first  set  which  best  predict  the  second  set  as 
measured  in  terms  of  a  specified  quadratic  form  in  the  prediction  error.  Use  of  a  general¬ 
ization  of  the  singular  value  decomposition  reduces  this  problem  to  a  simple  form  with  an 
explicit  geometric  interpretation,  includes  the  case  of  singular  covariance  matrices,  is  the 
preferred  numerical  procedure  for  actual  computation,  and  gives  a  complete  characteriza¬ 
tion  of  nonuniqueness  in  the  case  of  multiple  solutions.  The  optimal  solution  is  shown  to 
be  a  formal  application  of  classical  canonical  correlation  analysis  to  a  "pseudo*  covariance 
matrix.  Special  cases  include  the  classical  canonical  correlation  analysis,  the  standard  as 
well  as  a  generalized  principal  component  analysis,  the  optimal  selection  of  instrumental 
variables,  and  reduced  rank  regression. 
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1.  Introduction 


In  recent  years  there  has  been  considerable  interest  in  unifying  concepts  in  multivari¬ 
ate  analysis  and  research  into  a  number  of  generalizations  (Izenman,  1975;  Muller,  1982; 
Rao,  1979).  The  approach  taken  in  this  paper  is  to  formulate  a  single  generalized  predic¬ 
tion  problem  which  includes  a  number  of  multivariate  analysis  procedures  such  as  princi¬ 
pal  component,  canonical  correlation,  reduced  and  full  rank  regression,  instrumental  vari¬ 
ables,  as  well  as  some  generalizations  of  these.  Some  of  these  multivariate  analysis  pro¬ 
cedures  are  not  traditionally  formulated  or  considered  as  prediction  problems,  and  this 
extends  the  range  of  useful  applications  for  these  methods  (Yohai  and  Garcia  Ben,  1980). 
The  prediction  problem  is  very  naturally  considered  as  a  generalized  canonical  variate 
analysis. 

A  primary  objective  of  this  paper  is  to  give  a  complete  characterization  of  the  solu¬ 
tions  of  the  generalized  prediction  problem  in  the  cases  of  multiple  solutions  and/or  singu¬ 
lar  covariance  matrices.  Such  multiple  solutions  may  arise  in  the  reduced  rank  case  with 
repeated  singular  values  or,  in  terms  of  the  traditional  formulation,  with  repeated  general¬ 
ized  eigenvalues.  Multiple  solutions  have  received  little  attention  and  seem  not  to  have 
been  characterized  from  the  geometric  point  of  view  in  terms  of  subspaces  as  is  given  in 
this  paper.  The  singular  case  has  also  received  little  attention.  This  is  probably  due  to  the 
rather  considerable  complexity  in  the  derivation  and  description  of  procedures  such  as 
canonical  correlation  analysis. 

The  classical  approach  to  reduced  rank  or  rank  constrained  problems  such  as  princi¬ 
pal  component  and  canonical  variate  analyses  has  been  the  use  of  canonical  representa¬ 
tions  which  are  obtained  by  the  solution  of  related  generalized  eigenvalue-vector  problems 
(Hotelling,  1936).  The  canonical  variables  have  a  particularly  simple  covariance  structure 
although  the  means  of  obtaining  them  are  often  quite  complicated  involving  the  solution 


of  a  constrained  maximization  problem  by  differentiation  leading  to  the  generalized  eigen- 
problem.  Most  treatments  do  not  prove  that  the  conditions  sufficient  for  the  existence  of 
such  a  maximum  are  satisfied  as  noted  by  Stuart  (1982).  Rarely  is  there  any  discussion  of 
the  multiplicity  of  solutions,  an  exception  being  Yohai  and  Garcia  Ben  (1980).  The  case 
of  singular  covariance  matrices  is  not  included  in  these  approaches  and  has  received  very 
little  attention  in  the  literature  (see  Khatri,  1976). 

In  recent  years,  the  simple  structure  of  the  covariance  matrix  of  the  canonical  vari¬ 
ables  has  been  expressed  in  terms  of  a  singular  value  decomposition  (SVD)  of  appropriate 
quantities  depending  upon  the  particular  problem  such  as  principal  components  or  canoni¬ 
cal  variates.  In  a  few  discussions,  the  derivations  were  considerably  simplified  by  the  use 
of  the  singular  value  decomposition  as  compared  with  the  classical  eigenproblem  (Good, 
1969;  Mandel,  1982;  Rao,  1979;  Stuart,  1982).  While  this  greatly  simplifies  the  derivation 
and  interpretation,  a  unified  treatment  of  the  various  reduced  rank  problems  is  not  avail¬ 
able. 

The  approach  of  this  paper  using  a  generalization  of  the  singular  value  decomposi¬ 
tion  includes  simply  the  cases  of  multiple  solutions  and  singular  covariance  matrices.  This 
approach  involves  concepts  and  methods  from  the  singular  value  decomposition  which  in 
recent  years  has  become  a  standard  tool  of  linear  algebra  for  the  investigation  of  reduced 
rank  and  illconditioned  problems  from  both  an  analytical  as  well  as  a  computational  point 
of  view  (Golub,  1969;  Lawson  and  Hanson,  1974;).  This  approach  focuses  immediately 
upon  the  central  algebraic  and  geometric  properties  of  the  problem  and  gives  the  general¬ 
ized  canonical  variables  directly.  The  generalized  singular  value  decomposition  reduces 
the  optimal  prediction  problem  to  a  simple  form  which  is  directly  and  easily  solved  using 
elementary  properties  of  orthonormal  matrices.  This  avoids  the  need  to  solve  a  con¬ 
strained  maximization  problem  by  differentiation  using  Lagrange  multipliers  which  is  the 
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traditional  approach  to  canonical  variate  analysis. 

The  generalized  singular  value  decomposition  provides  a  unification  on  several  levels. 
A  single  mathematical  framework  using  the  generalized  singular  value  decomposition 
solves  a  single  generalized  problem  that  can  be  specialized  to  the  various  reduced  rank 
prediction  problems.  This  unified  treatment  gives  the  complete  multiplicity  of  solutions 
for  cases  with  repeated  singular  values  and  simultaneously  includes  the  case  of  a  singular 
covariance  matrix  largely  missing  in  the  literature.  Also  there  is  unification  using  the  gen¬ 
eralized  singular  value  decomposition  in  the  derivation  of  the  proof,  the  mathematical 
statement  of  the  results,  the  geometric  interpretation  of  the  prediction  problem  and  its 
solution,  and  the  computation  of  the  solution  using  modem  numerical  methods  that  are 
numerically  accurate  and  stable.  This  gives  a  considerable  unification  of  the  teaching, 
understanding,  interpretation,  and  application  of  these  methods.  The  diversity  and  com¬ 
plexity  of  the  present  literature  makes  the  learning  and  understanding  of  such  methods  as 
canonical  correlation  analysis  difficult  for  many  potential  users,  and  is  considered  by  some 
to  be  largely  responsible  for  its  relative  neglect  in  applications. 


2.  A  Generalized  Prediction  Problem 

Consider  two  sets  of  zero  mean  random  variables  XT  —  (xj, . .  .  ,xm)T  and 
YT  =  (yi,...,yll)T  with  a  joint  covariance  matrix  of  (XT  ,YTf  given  by 


2  2  1 


\XJT 

Sja 

2*y 

(12) 


where  and  S*  are  possibly  singular.  In  this  paper,  the  following  constrained  predic¬ 
tion  problem  is  considered:  for  a  given  p,  find  a  p -dimensional  vector  Z  =  HpX  of  linear 
combinations  of  X  such  that  the  optimal  prediction  Yz  of  Y  based  upon  Z  minimizes  the 
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general  quadratic  prediction  error  measure 


II  i'-nll  l>  =  evy-yJ  A\r-y,)}  (2  2) 

where  A  is  an  arbitrary  nonnegative  definite  symmetric  matrix  of  rank  n,  and  t  denotes 
the  pseudoinverse,  i.e.  the  inverse  of  the  full  rank  part  of  A.  Let  L  satisfy  LALT  —  / 
where  L  is  full  rank  with  dimension  n  Xn ,  then  it  will  be  convenient  to  express  A*  =  LT L . 
From  the  eigenvector  decomposition  of  a  matrix,  the  rows  of  L  span  the  same  subspace  as 
the  eigenvectors  of  A  with  nonzero  eigenvalues.  Such  an  L  will  occur  naturally  in  the 
generalized  singular  value  decomposition.  Although  the  use  of  the  inverse  or  pseudoin¬ 
verse  in  the  definition  of  the  prediction  problem  may  appear  awkward,  it  will  lead  to  con¬ 
siderable  simplicity  in  formulating  the  mathematical  problem  to  be  solved  and  in  the 
geometrical  and  statistical  interpretation  of  the  resulting  solution.  The  prediction  problem 
(22)  is  considered  in  the  case  that  A  is  full  rank  by  Izenman  (1975)  and  Rao  (1979).  Lari- 
more  (1983)  extends  the  prediction  problem  to  the  case  of  time  series  analysis  of  Markov 
processes  of  constrained  Markov  (state)  order  p . 

In  the  paper,  the  geometrical  interpretation  will  play  an  important  part.  A  linear 
vector  space  V  of  random  variables  generated  by  a  set  S  of  random  variables  is  defined  as 
the  set  V  of  all  random  variables  that  are  linear  combinations  of  S .  In  the  sequel,  several 
inner  products  <u,v>r  -  EutTv  for  u,v  £V  will  be  defined  for  various  positive  semidefin- 
ite  symmetric  matrices  I\  Two  random  variables  u  and  v  are  orthogonal  with  respect  to 
the  inner  product  <.,.>p  if  <u,v>r  =  0,  and  a  set  of  random  variables  uj, . . .  ,un  are 
orthonormal  if  they  are  orthogonal  and  in  addition  =  1.  Then  all  of  the  usual 

properties  of  inner  product  spaces  apply  to  such  a  space  of  random  variables  such  as  sub¬ 
space,  rank  of  a  subspace,  and  linear  independence  of  vectors. 
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Consider  the  case  where  is  full  rank.  Then  for  an  arbitrary  p -vector  Z  =  HpX  of 
linearly  independent  combinations  of  X  where  p  =s  m  so  Hp  is  rank  p  ,  the  optimal  esti¬ 
mate  Yz  of  Y  given  Z  is 

Yz  =  X„  X"1  Z  =  X^  HTp{Hp  X„  Htp)~'z  (23) 

and  the  prediction  error  is 

||  Y-Yz\ |  Jt  =  tr  AtX>y  -  rrA%,  //J(//p  X„  (2.4) 

Now  Hp  does  not  uniquely  specify  Z  in  terms  of  estimating  Y  since  from  inspection  of 
(23)  any  nonsingular  transformation  of  Z  will  leave  Yz  invariant.  An  orthonormalization 
of  Z  will  give  an  equivalent  Z  =  JpX  with 

X3  =  Jp  JTp  =  Ip  (25) 

where  Ip  is  the  p  xp  identity  and  where  the  last  equality  is  satisfied  if  Rank  (X^ )  ^  p . 

In  the  singular  case  where  RankQ^)  <  p,  then  by  an  orthonormalization  of  Xs  a 
new  set  of  random  variables  Z  =  AZ  =  AJpX  —  J-X  of  lower  dimension  p  can  be  chosen 
with  a  full  rank  covariance  matrix  equal  to  the  identity.  For  this  new  orthonormalized  set 
of  random  variables,  dropping  the  bar  notation  we  have  precisely  (25).  Note  that  by 
replacing  Hp  by  Jp,  the  inverses  in  (23)  and  (2.4)  are  then  also  well  defined.  We  may 
thus  in  any  case  introduce  the  constraint  (25)  on  Jp  without  loss  of  generality.  The 
optimum  prediction  problem  (22)  can  thus  be  stated  mathematically  as  choosing  a  Jp  to 
minimize 

||  Y-Yz\ 1 1  =  trA%  -  trA%  (2.6) 


subject  to  the  constraint 
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(2.7) 


Jp  ^  Jp  =  'p 


This  problem  is  of  great  practical  interest.  The  classical  canonical  correlations  and 
variates  analysis  will  be  shown  to  be  equivalent  to  minimizing  (22)  with  A  =  2^  .  The 
principal  component  analysis  problem  is  equivalent  to  Y  =  X  so  2^  =  2^  =  2^  and  in 
addition  setting  A  =  /  .  More  general  weightings  are  afforded  by  other  choices  of  A 
which  can  reflect  a  cost  of  prediction  error  of  practical  value  such  as  dollars  or  a  second 
order  approximation  to  a  nonlinear  cost  function.  The  particular  weighting  A  used  in  a 
given  problem  can  make  a  considerable  difference  in  the  solution,  which  suggests  that  the 
classical  canonical  correlation  analysis  in  some  cases  does  not  give  the  most  appropriate 
choice  of  A  .  The  generalized  canonical  variate  analysis  provides  a  unified  framework  for 
canonical  correlation  analysis  and  principal  component  analysis  as  well  as  more  general 
prediction  problems. 

3.  A  GENERALIZED  SINGULAR  VALUE  DECOMPOSITION 

A  very  intuitive  approach  to  finding  the  canonical  decomposition  is  through  one  par¬ 
ticular  generalization  of  the  singular  value  decomposition.  The  usual  singular  value 
decomposition  is  given  by  the  following  (Lawson  and  Hanson,  p.  20-1,  1974). 

Theorem  1.  If  A  is  a  real  m  Xn  matrix  of  rank  r,  then  there  exist  orthonormal 
matrices  B(mXm)  and  C(nXn)  such  that 

BtAC  =  Diag(dl  0,...,0)  ,  BTB  =  Im  ,  CTC  =  /„  (3.1) 

where  Diag  denotes  a  mX/i  diagonal  matrix  with  nonnegative  elements  in  decending 
order. 
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In  generalizing  this,  let  P  be  a  nonnegative  definite  symmetric  matrix  of  rank  r  . 
Then  we  define  a  matrix  J  to  be  P -orthonormal  (rowwise)  when  JPJ T  =/r.  Note  that 
this  definition  includes  the  requirement  that  the  dimension  of  J  is  RankP  xDimP  with  J 
full  rank.  This  can  all  be  conveniently  stated  simply  as  JPJT  =  7^^.  so  that  the  dimen¬ 
sion  and  ranks  of  P  and  J  do  not  have  to  be  explicitly  stated.  Also  throughout  the 
paper,  D  =  Diag(dn,  •  •  •  ,du,  -  •  ■ )  will  denote  a  general  rectangular  matrix  with  all  ele¬ 
ments  zero  except  for  elements  du  on  the  main  diagonal.  Then  the  ( RJS)-singular  value 
decomposition  is  given  by  the  following  theorem. 

Theorem  2.  Let  R  and  S  be  nonnegative  definite  symmetric  matrices  of  order  m  and 
n  and  ranks  in  and  it  respectively,  and  let  A  beamXn  matrix.  Then  there  exist  transfor¬ 
mations  J  and  L  such  that 

JALt  =  D=  Diag(yl  >...>  7r  >  0,...,0)  ,  JRJT  =  ,  LSLT  -  1^  (32) 

Thus  the  transformations  J  and  L  are  R-  and  S -orthonormal  respectively  and  in  addition 
satisfy  the  following: 

(i)  For  distinct  singular  values  7i's,  the  row  vectors  of  J  and  L  are  unique  except  for  a 
sign  change. 

(ii)  For  repeated  singular  values  7i's,  the  rows  of  J  corresponding  to  a  given  repeated 
singular  value  must  span  a  fixed  subspace,  and  similarly  for  L. 

(iii)  Any  transformations  J  and  L  satisfying  the  decomposition  (32)  are  related  to  a  par¬ 
ticular  solution  /.  in  terms  of  a  block  diagonal  orthonormal  matrix  of  the  form 

J  =  Diag (P j . PhJ3u)Jm  ,  L  =  Diag(Plt  .  . .  ,Ph ,PV)L.  where  the  blocks  P}  are  arbitrary 

orthonormal  matrices  which  for  j^h  have  dimension  kj  xkj  corresponding  to  the  7-th 
nonzero  value  of  7  that  repeats  kj  times  and  where  Pu  and  Pv  are  orthonormal  matrices  of 
dimension  Rank(R)  -  r  and  Rank(S)  -  r  respectively.  Thus  for  any  J  and  L,  the  rows 


corresponding  to  the  same  singular  value  are  orthonormal  linear  combinations  of  the 
corresponding  rows  of  J»  and  L.  . 

Proof:  Existence:  Let  B  and  C  be  any  R-  and  5 -orthonormal  matrices  respectively 
so  that  BRBt  =  IfadtR  and  CSC 7  =  Ir^m  .  Now  consider  the  singular  value  decomposition 
of  Theorem  1  applied  to  BACT  ,  so  B7  BAC7 C  =  D  with  B7  B  =  /  =  CTC  .  Then  J  =  B7  B 
and  L  =  CTC  satisfy  (32). 

Uniqueness:  To  determine  all  solutions,  let  J ,  L,  and  D  be  another  solution  satisfy¬ 
ing  (32).  Then  J7  RJ  =  I r^^  =  JTRJT  implies  that  the  row  vectors  of  J  and  J  span  the 
full  rank  subspace  of  R  .  Thus  there  exists  a  nonsingular  matrix  F  such  that  J  =  FJ  and 
similarly  there  exists  a  nonsingular  matrix  G  with  L  =  GL  .  From  the  decomposition  (32), 
I  Rank*  ~  JIV*  —  FJRJtFt  =  FFt  and  similarly  GGT  =  so  that  F  and  G  are 

orthonormal  matrices.  Also  DDT  =  FDD7 FT  ,  and  from  the  uniqueness  of  the  eigen¬ 
values  and  eigenvectors  of  a  symmetric  matrix  it  follows  that  DD7  =  DD7  so  D  =  D  and 
that  F  is  block  diagonal  with  blocks  corresponding  to  the  repeated  singular  values.  A  simi¬ 
lar  result  holds  for  G  by  considering  D7 D  .  Now  D  =  D  =  FDG 7  ,  so  using  the  block 
diagonal  forms  of  D,  F ,  and  G  with  diagonal  blocks  Dh  Fh  and  Gf  respectively,  we  have 
for  every  block  i  with  =  0  that  7,/  =  F i  ~iJGj  so  FfiJ  =  /  =  Fi F7  which  implies 
Ft  =  Gi  since  they  are  both  square  matrices  which  proves  the  Theorem. 

One  generalization  of  the  singular  value  decomposition  proposed  by  Van  Loan  (1976) 
is  somewhat  different  defining  P  -orthonormality  column  wise  and  using  the  inverse  of  the 
transformation  J  so  that  the  decomposition  satisfies  the  following:  J7RJ  =  , 

LT SL  =  ,  J~tAL  =  Diaghi  >...>  yr,0 _ ,0)  .  If  we  make  the  identification 

J  —  J~T  ,  L  =  L7  ,  and  R  =  R~l  ,  then  J ,  L,  R,  S  and  D  satisfy  the  generalized  singular 
value  decomposition  (32).  From  a  statistical  point  of  view,  this  decomposition  is  much 
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more  intuitive  as  is  seen  in  the  next  section,  and  treatment  of  the  case  of  singular  covari¬ 
ance  matrices  would  be  considerably  more  involved  using  the  Van  Loan  decomposition. 

4.  GENERALIZED  CANONICAL  VARIATE  ANALYSIS 

Now  consider  the  (£„  ,A)-singular  value  decomposition  of  2^  given  as 

Jl^L7  =D  =Diag{ 7i . 7r,0,-.,0)  ,  JXaJT  =  / -  ,  LAL7  =  I-  (4.1) 

where  m  —  Rank  )  and  n  =  Rank  (A).  This  decomposition  has  the  very  intuitive 
interpretation  of  a  new  basis  defined  by  the  generalized  canonical  variables  or  variates 

U  =JX  ,  V  =  LY  (42) 

of  dimensions  m  and  n  respectively  for  which: 

(i)  =  /  ,  so  that  the  components  of  U  are  uncorrelated  with  variance  unity. 

(ii)  Xm  =  Diag  (yl . yr,0 . 0)  ,  so  that  the  components  of  U  and  V  are  uncorre¬ 

lated  except  for  the  i-th  pairs  with  covfavj)  —  y4-  .  The  will  be  called  canonical 
covariances . 

(iii)  the  norm  of  the  prediction  error 

||  Y-Yz\ |  Jt  =  E{(Y-YzfLTL(Y-Yz)}  =  E  {(V -V  zf  (V -V  z)}  =  ||  V-Vz\\  f  (43) 

is  a  sum  of  squares  in  V-Vz  where  Vz  =  LYZ,  and  the  inner  product  induced  by  the 
transformation  L  is  <T1,T2>a  =  E{Y\l7LY^  =  <V1,Vr2>/,  the  inner  product  with  respect 
to  the  identity. 

(iv)  the  projection  of  the  prediction  error  Y-Yz  on  the  full  rank  subspace  of  A,  i.e.  where 
the  prediction  error  has  nonzero  weighting,  is 
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AA\Y-YX)  =  A LtL(Y-Yz)  =  A Lt(V-Vz) 


(4.4) 


which  gives  the  inverse  transformation  from  V  onto  the  full  rank  subspace  of  A. 

As  in  the  discussion  following  (25),  U  contains  the  part  of  X  involving  the  full  rank 
part  of  2^ .  Thus  without  loss  of  generality  X  may  be  expressed  as  X  =  KU  .  In  terms  of 
the  canonical  variables  U  ,  the  p  linear  combinations  Z  are  Z  =  JpX  -JpKU  =  MpU 
where  we  define  the  p  Xm  matrix  Mp  =  JpK  .  Using  the  constraint  (25)  gives  the 
equivalent  constraint 

MpMTp  =  MptmMTp  =ZU=IP  (45) 

so  that  Mp  has  orthonormal  rows.  Furthermore  since  Z  =  JpX  =  MpU  =  MpJX,  we  can 
substitute  into  (2.6)  the  relationships  Jp  =  MpJ  and  At  =  LTL .  Use  of  the  generalized 
singular  value  decomposition  (4.1)  then  gives  the  simple  expression 

II  Y-Yz\\  Jt  =  II  V-Vz\\  f  =  tr Af2jy  -  trMpDDTMTp  (4.6) 

Thus  the  generalized  singular  value  decomposition  (23)  reduces  the  original  problem  of 
minimizing  (2.6)  subject  to  the  constraint  (2.7)  to  the  problem  of  finding  apXm  matrix  Mp 
with  orthonormal  rows  maximizing  trMpDDT Mp  with  D  =  Diag (yj  a  •  ■  •  a  7r  >  0,  •  •  •  ,0) 

.  To  solve  this  maximization  problem  requires  only  jthe  elementary  properties  of  orthonor¬ 
mal  matrices  as  stated  in  the  following  Lemma. 

Lemma  1:  Let  the  integer  p  ^  in  be  fixed,  let  mj  be  the  columns  of  the  p  Xm  matrix 
Mp  with  orthonormal  rows,  and  suppose  the  mXn  diagonal  matrix  D  is 

D  =  Diagii! . 7,>7,+i  =  •  •  *  =  Ip  =  •  •  •  =  7,+*>7,+t+i,  '  )  for  k  repeated 

values  equal  to  yp,  so  for  yp  unique  we  have  q  +1  =  p  =  q+k  +1.  Let  Mq  -  [mlP  . .  .  ,mq\, 
M  '  •  ■  ’^q+k  ],  Mm  =  [m,+Jk+1,  .  .  .  ,m-].  Then  TrMpDDTMJp  is  a  maximum  if  and 
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only  if  all  of  the  following  hold: 


MqTMq  =  I  ,  MqTMk  =  0  ,  Mm  =  0 


(4.7) 


Proof:  By  Gram-Schmidt  orthonormalization,  the  m  xp  matrix  Mp  may  be  extended 
to  a  square  mXm  matrix  [AfJ  NT]  with  orthonormal  columns.  Thus 


Mp 

N 


\MT  nt\  = 

A  J 


0  'a-. 


!<  «r] 


", 

// 


(4-8) 


where  the  second  equality  follows  since  a  right  inverse  is  also  a  left  inverse.  In  particular, 
denoting  the  «'-th  column  of  Mp  and  N  by  m,  and  nt  respectively,  we  have  mjmi+njni  =  1 
so  that  mfmj  ^  1  .  Furthermore 


P  =  tr/p  =  tr(MpMp )  =  tr(M*Mp)  =  ^rnfm, 

i-1 

=  trMqTMq  +  trMkTMk  + 

Using  mfmi  ^  1  implies  the  inequality 


(4.9) 


tr(MTpMp)DDT  =  fyfmfm,  ^ 

i= 1  1 


(4.10) 


By  considering  mfmj  =  at  as  arbitrary  positive  numbers  whose  sum  is  p ,  it  is  easily  shown 
that  the  equality  is  achieved  if  and  only  if  mfmt  =  1  and  njnt  =  0  and  mfml  =  0  for 
q+k  <  i  ^  m. 

(Only  if).  Now  if  we  partition  N  similar  to  that  of  Mp  so  N  =  [0  Nk  Nm],  then 
when  the  maximum  is  achieved  we  must  have  from  (4.8) 


MqT 

MkT 

0 


0 

NkT 

Na- 


Mq  Mk 
0  Nk 


0 

Nm 


MqTMq  MkTMq  0 


MkTMq  MkTMk+NkTNk 
0  N"trNk 


NkTNn 

NxrNm 


(4.11) 
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which  implies  that  MqTMq  =  Iq  and  MqTMk  =  0. 

(If).  Suppose  that  (4.7)  is  true.  Then  from  MqTMq  =  Iq,  Mm  —  0  and  (4.9), 

trMkTMk  =  ^  mjmi  =  p-q  (4.12) 

i=q+ 1 

Then  using  (4.10), 

Tr{MTpMp)DDT  =  -  £7j2  +  7,+1  trMkTMk  =  (4.13) 

i=  1  i=l  i=l 

so  that  the  maximum  is  achieved  which  proves  the  Lemma. 

5.  Optimal  Prediction  via  Generalized  Canonical  Variables 

Using  the  above  reduction  to  canonical  variables  and  previous  Lemma,  solutions  to 
minimizing  the  prediction  error  (22)  are  characterized  simply  in  terms  of  the  generalized 
canonical  variables  from  the  generalized  singular  value  decomposition  (4.1).  The  solution 
is  given  by  essentially  choosing  Z  as  the  first  p  canonical  variables,  although  for  repeated 
singular  values  it  is  somewhat  more  involved  in  that  any  p-q  dimensional  subspace 
corresponding  to  the  repeated  singular  value  may  be  chosen.  The  uniqueness  of  the  gen¬ 
eralized  singular  value  decomposition  exactly  characterizes  the  nonuniqueness  of  the 
canonical  variables  and  the  solution  to  the  optimal  prediction  problem  (22).  This  is  pre¬ 
cisely  stated  in  the  following  theorem. 

Theorem  3:  Consider  the  problem  of  choosing  p  linear  combinations  Z  =  HpX  of  X 
for  predicting  Y  such  that 

||  Y-Yz\ |  Jt  =  E{(Y-YjA\Y-Yz)}  (5.1) 
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is  minimized  where  and  A  are  possibly  singular  positive  semidefinite  symmetric 
matrices  with  ranks  m  and  n  respectively.  Then  existence  and  uniqueness  are  given  as 
follows: 

(i)  Existence:  Z  =  HpX  is  a  solution  minimizing  (5.1)  if  and  only  if  there  exist 
transformations  J  and  L  satisfying  the  ,A)-generalized  singular  value  decomposition 

J^JT  =  la  ,  L\Lt  =  1-  ,  JXjcyLT  =  Diag(yi  =£...*  yr,0 . 0)  (52) 


such  that 

(a)  Z  =HpX  spans  the  first  p  of  the  canonical  predictors  U  =  JX ,  i.e.  we  have 
Z  =  QUp=Q[Ip  0]C/  =  Q[lp  0]IX  for  some  nonsingular  Q  .  Thus  Hp  =  Q[Ip  0J/  ,  so  the 
rows  of  Hp  are  linearly  independent  linear  combinations  of  the  first  p  rows  of  J .  In  addi¬ 
tion  we  have: 

(b)  the  prediction  error  is  reduced  in  the  span  of  the  corresponding  first  p  canonical 
variables  V ,  and  the  corresponding  subspace  of  Y  is  the  span  of  the  random  variables  con¬ 
sisting  of  linear  combinations  of  Y  given  by  the  first  p  rows  of  L . 

(c)  there  is  no  reduction  in  the  prediction  error  in  the  span  of  the  last  n-p  variables  of 
V ,  and  the  corresponding  subspace  of  Y  is  the  span  of  the  random  variables  consisting  of 
linear  combinations  of  Y  given  by  the  last  n-p  rows  of  L . 

(ii)  Uniqueness: 

(a)  If  yp+1  >  yp  then  the  solution  in  (i)  is  essentially  unique,  i.e.  the  subspaces  in  (a),  (b), 
and  (c)  are  unique  and  given  by  any  particular  representation  (52). 

(b)  If  yp  =  yp+1  with  k  equal  singular  values  yq+l  =  •  •  •  =  yp  =  •  •  •  =  ,  then  the 

subspaces  in  (i)  are  not  unique.  The  subspace  span  by  Z  contains  the  first  q  canonical 
predictors  Uq  =  [/?  0}/X  and  in  addition  contains  an  arbitrary  selection  of  p-q  linear 
combinations  of  the  canonical  variables  uq+1,  •  •  •  ,uq+k.  In  particular,  Hp  has  the  form 
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[/,  o  q 


(53) 


where  is  (p-q)xk  with  orthonormal  rows  so  Cp^Cp^  =  Ip_g  and  Q  is  an  arbitrary 
nonsingular  matrix. 


p-qp-q 


(iii)  The  minimum  value  is 


(5.4) 


Proof:  (i)  Existence  -  (only  if).  Suppose  Z  =  HpX  is  given  which  minimizes  (5.1), 
then  we  seek  a  generalized  singular  decomposition  satisfying  (52)  and  (i)(a).  To  simplify 
the  derivation  we  work  with  the  equivalent  Z  =  JpX  as  in  (2.6)  subject  to  the  constraint 
(2.7).  Now  consider  a  fixed  decomposition  (52)  with  J  and  L  given  with  corresponding 
canonical  variables  U  and  V.  From  the  discussion  following  (4.4),  there  exists  a 
Mp  =  JpK  satisfying  (45)  and  minimizing  (4.6).  We  use  Mp  to  construct  7  and  L  satisfy¬ 
ing  (52)  and  (i)(a). 

As  in  Lemma  1,  suppose  that  Mp  —  [Mq  Mk  0]  minimizes  the  prediction  error. 

From  the  nonuniqueness  of  the  generalized  singular  value  decomposition,  the  problem  is 
to  select  a  new  basis  from  among  the  columns  of  Mk  which  is  full  rank  and  use  this  in  the 
construction.  This  is  simply  accomplished  by  considering  the  generalized  singular  value 
decomposition  of  the  matrix  Mp  with  respect  to  the  identity  matrices  given  by 
FMpG T  =  D ,  FFt  =  Ip,  GGt  =  From  the  orthogonality  of  Mp,  we  have  lp  =  FFT 
=  FMpMpF T  =  DDt  so  that  D  =  [Ip  0].  Partitioning  the  various  matrices  in  this  singular  ♦ 

value  decomposition  corresponding  to  the  partitioning  of  Mp  =  [Mq  Mk  0]  gives 

* 

[FMqGq  FMkGk  0]  =  [Ip  0].  A  reordering  of  columns  of  F  gives  of  the  generalized 
singular  value  decomposition  of  the  p  xk  matrix  Mk  with  respect  to  the  identity  matrices 
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as 


,  BBt  =Ip  ,  CCT  =/t  (55) 

where  k  >  p-q.  The  rank  of  M*  is  thus  obviously  p-q.  Let  Bp_q  =  [/  0]B  and 

Cp =  \Ip_q  0]C  be  the  first  p-q  rows  of  B  and  C  respectively. 

Now  consider  the  transformation  on  the  variables  Z  to  Z 


BMkCT  = 


p-q  u 
0  0 


Z  = 


MqT\ 


(5.6) 


In  the  sequel,  we  will  need  the  property  that  =  0  which  is  the  case  if  and  only  if 

MqT  Bp^Cp^  =  0  since  Cp~q  is  full  rank.  This  indeed  follows  from  Lemma  1  since 


M"TBTp.,Cp^ 


=  MqTBT 


Up-q 


0]C  =  MqTBTBMkCTC  =  MqTMk  =  0 


(5.7) 


Thus  the  matrix  of  (5.6)  is  orthonormal,  and  using  Z  =  M  U  =  M.JX  it  follows  that 


Z  = 


MqT] 

B 


[Mq  Mk  0]C7  = 


0 


0 


0  [/_  0  ]BMkCTC  0 


p-q 


U  = 


fq  0  Q 
0  Cp^  0| 


JX 


“  [Ip  0 yx  =  [Ip  0 ]U  =  Up  (5.8) 

where  J  =  Diag[Iq,C  .  From  Theorem  2  (iii),  the  transformations  J  and 

L  =  Diag[Iq,C  J-_k  y  are  just  an  alternate  set  of  matrices  satisfying  the  generalized 
singular  value  decomposition  (52).  Thus  (i)(a)  is  satisfied  since  Z  in  (5.6)  is  given  by  a 
nonsingular  linear  transformation  of  Z  which  is  by  construction  Up . 
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(iXif).  Suppose  Hp  =  Q[Ip  OJ/,  with  J  and  L  satisfying  (52).  Then  by  Lemma  1, 
with  Mp  =  [Ip  0],  (4.6)  is  minimized  so  that  Z  =  HpX  is  a  solution  minimizing  (5.1). 

To  show  (i)(b)  and  (c),  consider  the  prediction  error  V-V  in  V  where 
V  =  2^  \Ip  Of Ip  [Ip  Op  as  in  (23).  The  reduction  in  prediction  error  is 

=1-1  +DT[Ip  Of  [Ip  Op  =  Diag[y\, - 7^,0 . 0]  (5.9) 

which  proves  (b)  and  (c). 

(ii)  Uniqueness:  Suppose  that  there  are  two  solutions  satisfying  (52)  which  minimize 
the  prediction  error  (5.1).  Then  by  the  uniqueness  of  the  generalized  singular  value 
decomposition  from  Theorem  2,  the  respective  J  and  L  matrices  are  related  by  a  block 
diagonal  orthonormal  matrix.  If  yp  is  unique,  then  so  is  the  subspace  span  by  the  rows  of 
Jp  which  proves  (iiXa).  If  yp  is  not  unique,  then  a  choice  of  a  different  generalized 
singular  value  decomposition  relates  to  a  different  choice  of  basis  for  the  k  -dimensional 
basis  corresponding  to  the  singular  value  yp .  Thus  there  is  an  arbitrary  choice  of  a  p-q 
dimensional  subspace  from  the  rows  q  + 1,  .  .  .  ,q+k  of  J  giving  the  canonical  variables 

uq  + 1 . Up.  The  matrix  Cp =  [7p_^  Op  is  constructed  in  (55)  which  proves  (ii)(b). 

The  minimum  value  is  given  by  setting  Mp  =  (Ip  ,0)  in  (4.6)  so  trMpD2Mp  =  y2+  ■  ■  •  Jry2  . 
This  proves  the  theorem. 

6.  Multivariate  Reduced  Rank  Prediction 

The  Theorem  3  includes  a  number  of  special  cases  that  arise  in  the  analysis  of  mul¬ 
tivariate  data.  A  particular  solution  to  the  general  prediction  problem  (22)  in  the  case  of 
A  nonsingular  is  given  by  Izenman  (1975)  and  Rao  (1979),  although  the  solution  is  not 
unique  if  the  generalized  singular  values  are  not  distinct.  The  classical  canonical 
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correlation  analysis  problem  is  obtained  if  we  set  A  =  X^  ,  since  then  the  normalization 
L2,„Lt  -  Iv^y  implies  that  the  variables  V  -LY  are  orthonormal  and  hence  is  a 

jJ 

correlation  matrix.  Infact  the  solution  (52)  then  reduces  to  the  canonical  relationships 
-  lm  ,  ==  /„  ,  2„v  =  Diag(y 1(  .  .  .  ,yr,0,  .  .  .  ,0)  which  are  a  central  aspect  of 

canonical  correlation  analysis  (see  e.g.  Rao,  1973,  Sec.  8fJ2(iv)).  If  A  =/,  the  solution  is 
different  from  canonical  correlation  analysis  unless  =  I  . 

The  principal  component  analysis  problem  is  given  by  X  =  Y  and  A  =  /  ,  so  that  the 
norm  is  ||X-XZ||  }  =  E{(X-XZ)T (X~XZ)}.  A  generalization  of  principal  component  analysis 
is  obtained  by  setting  Y  =  X  so  X^  =  X^  =  Xw  and  using  an  arbitrary  positive  definite 
symmetric  weighting  A  so  the  norm  to  be  minimized  is 
||  X-Xj|  =  £{(X-Xz)rA_1(X-Xz)}.  A  different  generalization,  principal  component 

analysis  of  instrumental  variables,  is  discussed  in  Rao  (1965).  The  problem  is  equivalent 
to  setting  A  =/  so  the  prediction  error  norm  is  ||  K-Kjl  /  =  £{(y-i'z)r(l'-l?z)}.  The 
canonical  variables  U  are  called  the  principal  components  of  the  instrumental  variables  X 

In  the  above  particular  cases,  the  derivation  and  proofs  in  the  cited  references  all 
assume  that  the  matrices  X*,  and  A  (i.e.  X^  or  I)  are  nonsingular.  The  only  discussion  of 
the  singular  case  seems  to  be  Khatri  (1976)  for  the  canonical  correlation  analysis  which  is 
much  more  complicated  than  the  present  approach. 

The  definition  and  properties  of  the  generalized  singular  value  decomposition  clearly 
express  the  fundamental  properties  of  these  multivariate  prediction  problems.  Mathemati¬ 
cally,  geometrically  and  statistically  the  fundamental  relationship  is  the  selection  of  the 
canonical  variables  U  and  V  by  selecting  the  transformations  J  and  L  of  the  random  vari¬ 
ables  X  and  Y .  The  fundamental  geometrical  properties  of  these  transformations  are  that 
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J  and  L  are  orthonormal  with  respect  to  the  matrices  and  A  while  simultaneously  they 
are  orthonormal  with  respect  to  'Zxy  except  in  corresponding  pairs.  This  is  concisely  stated 
mathematically  by  the  generalized  singular  value  decomposition  which  includes  the  gen¬ 
eral  case  of  singular  matrices.  These  mathematical  orthonormality  relationships  have 
immediate  and  direct  statistical  interpretation  in  terms  of  the  identity  covariances  of  U 
and  V  ,  the  mutual  zero  correlation  between  U  and  V  except  in  pairs,  and  the  sum  of 
squares  property  of  the  prediction  error  V-V  with  the  addition  of  more  predictor  vari¬ 
ables  from  U .  The  different  multivariate  prediction  problems  correspond  only  to  a  dif¬ 
ferent  selection  of  the  random  variables  X  and  T  and  the  matrix  A  involved  in  the  weight¬ 
ing  of  the  prediction  error. 

7.  Computational  Aspects 

Modem  computer  algorithms  for  canonical  correlation  analysis  use  a  standard  singu¬ 
lar  value  decomposition  to  compute  the  generalized  singular  value  decomposition  (23) 
with  A  =  lyy  by  first  finding  square  root  factors  of  and  A  ,  and  then  doing  a  standard 
singular  value  decomposition  on  A  =  22iy(A_1  2)  =  QSRT  where  QQT  =  /  =  RRT  and  S 

is  diagonal.  Then  the  generalized  singular  value  decomposition  (23)  is  given  by 
J  —  2,  L  =  Rt A-12  and  D  =  S  .  Thus  the  joint  orthonormalization  of  X  and  Y  in 

the  norms  and  A  to  give  the  canonical  covariance  structure  D  is  very  naturally  viewed 
as  a  generalized  singular  value  decomposition  both  in  terms  of  the  simple  reduction  dis¬ 
cussed  in  Section  2  as  well  as  the  actual  computational  algorithms.  This  can  be  deter¬ 
mined  computationally  using  a  standard  singular  value  decomposition  which  is  numerically 
very  accurate  and  stable  as  compared  with  the  earlier  eigenvalue  computational  methods 
(Bjorck  and  Golub,  1973).  An  open  topic  is  the  investigation  of  numerical  methods  that 
directly  compute  the  generalized  SVD  rather  than  transforming  the  problem  to  the 
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standard  SVD.  Such  a  direct  approach  may  have  better  overall  numerical  accuracy. 

A  second  problem  is  specified  in  terms  of  the  observed  data  given  as  N  repeated 

observations  (X x,  .  .  .  ,XN)  =  C  and  (f1 . yw)=DonX  and  Y  respectively.  The  usual 

sample  covariances  are  computed  as  =  CCT  ,  2^  =  CDr  and  2W  =  DDT  which 

mathematically  are  used  in  the  generalized  singular  value  decomposition.  Numerically, 
however,  the  formation  of  these  products  defining  the  sample  covariances  results  in  a 
halving  of  the  numerical  precision  of  the  computation.  In  the  case  of  given  data,  Bjorck 
and  Golub  (1973)  give  computational  procedures  that  avoid  these  squaring  operations  and 
operate  directly  on  the  observed  data. 

Another  computational  aspect  that  may  have  a  considerable  effect  upon  statistical 
computing  in  the  future  is  parallel  computers.  A  very  efficient  algorithm  for  computing 
the  singular  value  decomposition  has  been  recently  devised  for  highly  parallel  systolic 
arrays  by  Brent  and  Luk  (1985).  Such  an  nX/i  square  array  of  processors  requires  com¬ 
munication  between  only  the  nearest  neighbor  processors  in  synchrony  with  the  processor 
computational  cycle.  The  computation  of  a  singular  value  decomposition  of  a  n'Xn  matrix 
using  a  nxn  array  of  processors  requires  only  order  n  processor  cycles  as  compared  to 
order  n  cubed  for  a  serial  computer  with  a  single  processor.  Such  parallel  processors  and 
algorithms  could  make  routine  the  analysis  of  very  large  sets  of  variables  such  as  arise 
naturally  in  multivariate  time  series  (Larimore,  1983). 

From  remarks  above,  it  is  obvious  that  the  optimal  solution  to  minimizing  the  qua¬ 
dratic  prediction  error  measure  (22)  has  exactly  the  same  structure  as  solving  the  "pseudo" 
canonical  correlation  analysis  problem  using  singular  value  decomposition  methods  with 
2^,  in  (1.1)  replaced  by  A  .  Although  the  matrix  (1.1)  is  no  longer  a  covariance  matrix,  a 
formal  application  of  canonical  correlation  analysis  indeed  gives  the  optimal  solution  to 
minimizing  (22).  Thus  a  sufficiently  general  computational  algorithm  can  be  devised 
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which  will  solve  all  of  the  particular  multivariate  problems  described  above.  Available 
algorithms  for  canonical  correlation  analysis  may  not  be  sufficiently  general  if  for  example 
they  assume  that  the  matrix  (2.1)  with  replaced  by  A  is  a  covariance  matrix  or  that  the 
canonical  covariances  are  correlation  coefficients  (  (yj)2  <  1 ). 
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ABSTRACT 

Very  general  reduced  order  filtering  and  modeling 
problems  are  phased  in  terms  of  choosing  a  state  based 
upon  past  information  to  optimally  predict  the  future 
as  measured  by  a  quadratic  prediction  error  criterion. 
The  canonical  variate  method  is  extended  to  approxi¬ 
mately  solve  this  problem  and  give  a  near  optimal  reduced- 
order  state  space  model.  The  approach  is  related 
to" the  Hankel  norm  approximation  method.  The 
central  step  in  the  computation  involves  a  singular 
value  decomposition  which  is  numerically  very 
accurate  and  stable.  An  application  to  reduced-order 
modeling  of  transfer  functions  for  stream  flow 
dynamics  is  given. 

1 .  Introduction 

Many  complex  random  phenomena  are  modeled  as 
high  order  of  infinite  order  Markov  processes.  Often, 
however,  most  of  the  behavior  of  interest  can  be 
adequately  approximated  by  a  Markov  process  model  of 
much  lower  order.  Many  of  the  modeling,  control,  and 
filtering  methods  depend  upon  a  Markov  or  state-space 
structure.  Even  implementation  of  the  general  Wiener 
filter  theory  often  requires  use  of  finite-order  state 
devices.  Thus,  it  is  frequently  necessary  to  reduce 
a  complex  process  to  a  limited  number  of  states  at 
some  point  in  the  analysis  or  implementation.  In 
this  paper,  the  problem  of  modeling  or  filtering  with 
a  restricted  order  state-space  is  addressed  with  empha¬ 
sis  on  how  best  to  determine  approximate  models  or 
filters  when  the  state  order  is  restricted. 

There  have  been  a  number  of  papers  dealing  with 
reduced-order  modeling,  filtering  and  system 
identification.  Here  we  review  only  those  related 
to  the  canonical  variate  approach,  with  more  technical 
details  contained  in  the  appropriate  sections.  The 
theory  of  canonical  correlations  and  variables  was 
developed  independently  by  Hotelling  (1936)  and  Obukhov 
(see  Gelfand  and  Yaglom  (1959)).  The  solution  of  the 
canonical  variate  problem  was  first  reduced  to  finding 
the  eigenvectors  of  several  symmetric  matrices 
(Hotelling  (1936),  also  see  Anderson  (1958)).  A 
more  computationally  efficient,  numerically  accurate 
and  stable  method  was  developed  by  Golub  (1969)  based 
upon  the  singular  value  decomposition  of  a  matrix. 

Gelfand  and  Yaglom  (1959)  generalized  the  canon¬ 
ical  variate  method  to  describe  the  correlation 
structure  between  two  discrete-  or  continuous -time 
random  processes  on  possibly  different  time 
intervals.  They  expressed  the  mutual  information 
between  two  such  random  processes  simply  in  terms  of 
the  canonical  correlations  (see  Section  5) .  Yaglom 
(1970)  considered  the  relationship  between  the  past 
outputs  of  a  process  and  the  future  outputs  (or 
any  two  disjoint  intervals)  and  has  shown  there  are  a 
finite  number  of  nonzero  canonical  correlations  if 
and  only  if  the  process  has  a  rational  power 
spectrum,  i.e.,  is  a  finite  order  Markov  process. 

Using  a  canonical  variate  analysis  between  the 
past  and  future  of  a  discrete  time  stochastic  process, 
Akaike  (1975)  constructed  a  minimal  realization 
procedure  for  Markov  processes.  This  resulted  In  a 
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stochastic  minimal  realization  algorithm  similar  to 
the  algorithm  of  Ho  and  Kalman  (1963),  Later,  Akaike 
(1974a)  gave  an  abstract  (coordinate  free)  descrip¬ 
tion  of  the  projection  of  the  future  of  a  process  on 
the  past  and  called  it  the  predictor  space.  The 
canonical  variate  realization  provides  a  particular 
basis  for  the  predictor  space.  He  used  the  concept 
of  the  predictor  space  to  characterize  any  minimal 
realization  for  a  discrete  time  Markov  process  as  a 
particular  choice  of  basis  for  the  predictor  space. 

The  predictor  space  concept  has  been  widely  used  in 
stochastic  realization  theory  (Clary  (1977), 

Fujishige  et  al .  (1975),  Picci  (1976)). 

Fujishige  et  al.  (1975)  addresses  the  reduced 
order  modeling  problem  using  the  predictor  space,  but 
they  do  not  use  the  canonical  variate  structure  for 
model  reduction.  The  criterion  they  define  is  the 
sum  square  prediction  error  of  all  output  components 
for  all  the  future  which  is  a  special  case  of  the 
prediction  error  criterion  discussed  in  Section  3. 

Their  procedure  requires  an  initial  state-space  model; 
however.  It  results  In  needing  only  to  solve  for 
eigenvectors  of  a  symmetric  matrix  whose  dimension  is 
the  original  system  state  order  -  a  very  small  amount 
of  computation  compared  with  most  reduced  order 
modeling  schemes.  A  very  interesting  but  brief 
discussion  of  canonical  variate  and  predictor  space 
methods  and  their  relation  to  filtering  problems  is 
given  by  Kailath  (1974).  He  talks  about  an  approxi¬ 
mation  problem  and  the  possible  usefulness  of 
canonical  variates,  but  he  does  not  explicitly  discuss 
a  reduced-order  filtering  problem. 

The  minimal  splitting  field  of  past  and  future  is 
the  continuous-time  analog  to  the  predictor  space  and 
predates  Akaike ' 3  work  although  he  was  the  first  to 
propose  a  realization  algorithm.  The  methods  involve 
abstract  Hilbert  spaces  to  accommodate  continuous 
time  processes  (Levinson  and  McKean  (1964),  McKean 
(1963),  Pitt  (1972),  Rozanov  (1976),  (1977)). 

The  optimal  Hankel  norm  approach  of  Adamjan 
et  al.  (1978)  has  received  much  recent  attention 
in  reduced  order  modeling  (see  Kung  and  Lin  (1981) 
and  cited  references).  Canute  and  Menga  (1982) 
discuss  relationships  between  the  canonical  variate 
approach  of  Akaike  and  the  optimal  Hankel  norm 
approach . 

2.  Approach 

A  major  departure  of  this  paper  from  previous  work 
is  the  use  of  canonical  variate  analysis  to  optimally 
choose  k  linear  combinations  of  the  past  for  prediction 
of  the  future.  The  very  natural  measure  of  quadraticly 
weighted  prediction  errors  at  possibly  all  future  time 
steps  is  used.  In  Section  3  we  formulate  the  problem 
and  show  how  a  generalized  canonical  variate  analysis 
problem  solves  it  explicitly.  The  interpretation  of 
canonical  variates  as  optimal  predictors  is  central 
in  motivating  interest  in  such  a  problem  formulation 
and  is  scarcely  found  in  the  statistical  literature. 

The  optimal  k-order  predictors  are  not  in  general 
recursively  computable,  but  the  optimal  state-space 
structure  for  approximating  them  is  expressed  simply 
in  terms  of  the  canonical  variate  analysis.  The 
problem  of  finding  an  optimal  Hankel  norm  reduced 
order  model  is  related  to  the  canonical  variate  approach 
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3 .  Statement  of  the  Problem 


Consider  the  problem  of  choosing  an  optimal 
system  ormodel  of  specified  order  for  use  in  pre¬ 
dicting  the  future  evolution  of  the  process.  We  will 
distinguish  between  the  past  p(t)  of  one  vector  pro¬ 
cess  r(t)  at  time  t  or  before  and  the  future  f(t)  of 
another  vector  process  s(t)  at  time  later  than  t  so 

pT(t)  -  (rT(c).  rT(t-l)  , . . . )  (3-1) 

fT(t)  -  (sT(c+l).  sT(t+2),...)  (3-2) 

tfe  assume  that  the  processes  r(t)  and  s(t)  are  jointly 
stationary. 

The  major  interest  is  in  determining  a  specified 
number  k  of  linear  combinations  of  the  past  p(t) 
which  allow  optimal  estimation  of  the  future  f(t). 

Any  set  of  k  linear  combinations  of  the  past  p(t)  are 
denoted  as  a  kxl  vector  m( t) ,  memory  of  the  past  of 
order  k.  The  optimal  linear  prediction  f(t)  of  the 
future  f(t)  which  is  a  function  of  a  reduced  order 
memory  m(t)  is  measured  in  terms  of  the  prediction 
error 

E ; ! f  -  i  ,2_l  •  EC (f  -  f)T  9'1  (f  -  £)]  (3-3) 

-1  ^ 

where  9  is  an  arbitrary  quadratic  weighting  and  E  is 
the  expectation  operation.  The  reduction  problem  is 
to  determine  an  optimal  k-order  memory 

m(t)  -  J^p(t)  (3-4) 

for  which  the  optimal  linear  predictor  f(t,  x(t)) 
minimizes  the  prediction  error. 


extension  of  the  classical  canonical  variate  analvsis 
method.  The  derivation  of  this  extension  is  rather 
lengthy  and  will  be  described  elsewhere. 

In  the  statistical  literature,  the  canonical 
variate  problem  is  delt  with  as  one  of  maximizing 
correlation  between  two  sets  of  variables  (i.e., 
p(t)  and  f  ( t ) ) ;  whereas  our  interpretation  will  be 
choosing  variables  from  p(t)  that  optimally  predict 
f(t),  which  is  rarely  the  conceptual  framework  used 
in  statistics. 

We  treat  here  explicitly  the  case  of  finite  past 
and  future,  i.e.f  p(t)  and  f(t)  of  finite  dimension, 
to  avoid  the  technicalities  of  the  infinite  dimen¬ 
sional  case  which  is  discussed  in  detail  in  Gelfand 
and  Yaglom  (1959) . 

4 . 1  Canonical  Variate  Solution 

The  solution  to  the  canonical  variate  problem  is 
expressed  quite  simply  by  putting  the  covariance 
structure  of  past  p(t)  and  future  f(t)  in  a  canonical 
form.  We  seek  nonsingular  transformations  of  p  and  f 

c  =*  Jp,  d  »  Lf  (4-1) 

such  that  in  this  new  basis  the  norm  (3-3)  for  weighting 
prediction  errors  of  the  future  is  a  sum  of  squares 

||f||2  .  -  dT(L8LT)*1d  =■  dTd  (4-2) 
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In  addition  the  covariances  among  the  past  and  between 
the  past  c  and  future  d  have  a  canonical  structure 

cov  (c , c )  -  I,  (4-3) 


In  various  particular  problems,  the  process  r(t) 
of  the  past  will  include  outputs  of  a  system  and/or 
inputs  of  a  system.  The  process  s(t)  of  the  future 
may  be  the  same  as  r(t)  or  different.  The  general 
case  of  interest  is  the  reduced  order  filtering  and 
modeling  problem:  given  the  past  of  the  related  ran¬ 
dom  processes  u(t)  and  y(t) ,  we  wish  to  model  and 
predict  the  future  of  y(t)  by  a  k-order  state-space 
structure  of  the  form 


x 


c+1 


*xr  +  Gut  +  w^_ 


(3-5) 


v  »Hx  +Au„  +  Bw.  +v„ 
-  t  t  t  t  t 


(3-6) 


cov  (c »d)  “  DiagCYji . . . ,7p *0, . . .0)  -  D  (4-4) 

with  the  canonical  covariances  Y^>. .  .>Y?>0  in  descending 

order.  Thus,  the  components  of  the  past  c  are  mutually 
uncorrelated.  Of  all  linear  combinations  of  p  and  f, 
the  first  component  of  c  has  maximum  covariance  with 
the  first  component  of  d. 

It  can  be  shown  that  for  any  order  k,  that  the 
first  k  components  of  c,  i.e.,  corresponding  linear 
combinations  of  the  past  p,  lead  to  the  best  prediction 
f  of  the  future  f.  The  optimal  choice  of  a  k-order 
memory  is  then 


where  w  and  v  are  white  noise  processed  that  are 
independent  with  covariance  matrices  Q  and  R  respect¬ 
ively.  These  white  noise  processes  model  the  co- 
variance  structure  of  the  error  in  predicting  y  from  u. 
A  special  case  of  the  reduced-order  filtering  problem 
is  the  transfer  function  approximation  problem  where 
u  and  v  are  the  input  and  output  processes  and  an 
approximate  state-space  model  is  desired. 

Once  the  optimal  k-order  memory  m(t)  is  determined, 
we  will  develop  state-space  equations  for  approximately 
computing  the  memory  or  recursively  describing  its 
evolution.  A  major  part  of  the  problem  is,  however, 
the  choice  of  the  optimal  k-order  memory. 

4 .  Janonical  Variates  as  Optimal  Predictors 

In  this  section  the  solution  to  choosing  the 
optimal  k-order  memory  is  described  in  terms 
of  the  canonical  variate  analysis  method  of  mathe¬ 
matical  statistics.  Here  the  solution  is  explicitely 
described  in  terms  of  a  singular  value  decomposition. 

To  treat  the  prediction  problem  of  section  3  involving 
an  arbitrary  9  in  the  prediction  error  (3-3)  requires 


\  -  V  (V0)JP 


(4-5) 


The  minimized  prediction  error  for  order  k  is  simply 
expressed  in  terms  of  the  canonical  covariances  as 


min 

J, 


Ilf 


tr 


3-1- 
'  "ff 


The  sum  of  squared  canonical  covariances  Y 


2 

k+1 


+  . 


(4-6) 
■  +Y2 


corresponding  to  the  neglected  variables  gives  the 
increased  prediction  error  from  using  memory  order  k 
rather  than  H. 


'ua  x  l  iua 


4.2  Calculations  Using  Singular  Value  Decomp 

The  requirements  of  (4-2)  through  (4-4)  are  equiva¬ 
lent  to  finding  J  and  L  such  that 


JI  ,L 

pf 


j  i  j 

pp 


Diag(y  , . . .  ,y,  ,0. . .0) 

I  K 


(4-8), 


L  9  L 


D 

I 


(4-7) 


’ (4-9) 
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H  -  USVT,  UUT 


I 


(4-19) 


This  is  easily  accomplished  using  a  singular  value 
decomposition  (Golub  (1969))  which  is  computationally 
very  efficient  and  numerically  very  accurate  and 
stable.  Dimensions  of  p(t)  and  f(t)  as  high  as  sev¬ 
eral  hundred  can  be  handled  efficiently  and 
accurately  using  these  computational  techniques. 


To  find  this  decomposition,  first  the  square 

roots  and  8^  are  computed  by  either  a  Cholesky 
PP 

procedure  or  an  eigenvector  procedure.  Since  a 
singular  value  decomposition  procedure  is  used 
latter  and  is  numerically  much  more  accurate  and 
stable,  it  can  be  used  to  find  the  eigenvalues 
and  eigenvectors 
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“PP 


T 

ViV 


U2S2V2 


(4-10) 


where  and  U2*V^  are  matrices  of  the  eigenvectors, 

S,  and  are  diagonal  and  contain  the  eigenvalues, 

and  the  equality  of  the  U's  and  V’s  follows  since 
and  9  are  positive  definite  and  symmetric.  The 

~  PP 

square  roots  are 


~PP 


U  s”*5  V  ^ t 
U1  bl  1  ’ 


(4-11) 


Now  form  the  matrix 


and  do  a  singular  value  decomposition 


(4-12) 


-  I, 


W 


T 


where  S  is  diagonal  with  nonnegative  singular  values 
in  nonincreasing  order.  A  property  of  the  singular 
value  decompositions  is 

inf  ' | l»  -  A||  -0  ,  (4-20) 

A  :  rank(A)  <  k 

As  shown  in  Adamjan  et  al  (1978),  this  bound  is 
achieved  for  A  restricted  to  a  Hankel  matrix  for 
single  input -output  systems,  and  is  at  least  a  lower 
bound  for  multivariable  systems-  This  solution 
is  a  minimax  solution  -  for  a  given  approximation  A 
the  norm  (4-17)  measures  the  largest  possible  error 
in  output  future  sequences  f  over  all  possible  past 
sequences  p  with  | |pj  |*1.  Tor  finite  order  systems, 
the  exponential  decay  of  the  impulse  response  will 
cause  the  worst  sequence  to  be  concentrated  near  the 
origin.  This  is  a  very  atypical  input  sequence  to  use 
as  a  basis  for  measuring  closeness 

By  contrast,  in  the  canonical  variate  formulation, 
the  norm  is 

E | j  f h  -  f Al 1 2  -  tr(H-A)(H-A)T  (4-21) 

-  1 1 h-a| | 2 

the  Frobenius  norm.  From  (4  -  6)  this  norm  has  the 
lower  bound 

inf  |  |H-A|  |f  -  (\+1+----W^)15  (*-22) 

A  :  rank (A)  *  k 


A  -  USVT,  UTU  -  VT  V  -  I 


(4-13) 


where  S  is  diagonal  with  nonnegative  elements  in 
descending  order 

S  -  dia  (s^  _>  s2  sft)  (4-14) 

The  canonical  variate  decomposition  is  obtained  by 
setting 

J  *  UT  Z**5,  L  -  VT  D  -  S  (4-15) 

PP 

4.3  Relationship  to  the  Hankel  Norm 

Consider  the  deterministic  input-output  case 
which  can  be  cast  in  the  canonical  variate  framework 
by  choosing  a  white  noise  stochastic  input  so  1^*1 

and  letting  3-1.  The  covariance  matrix  Z  is 


(4-16) 


where  the  Hankel  matrix  H  involves  the  impulse  response 
matrices  The  Hankel  norm  between  H  and  an 

approximation  A  is  defined  as 


H  -  A 


(4-17) 


where  1  I  1  is  the  spectral  norm 
s 


max 

lx!!  -  i 


Bx!  ! 


2’ 


Zx‘  (4-18) 


A  fundamental  difference  here  is  that  the  norm  (4-21) 
measures  the  average  overall  output  sequences  result¬ 
ing  from  random  input  sequences  with  unit  average  power, 

o  2 

i.e.,  E(u(t) ) ^“1  for  t<0;  whereas  for  (4— 20) , (t ) ) *1. 

Camuto  and  Menga  (1982)  discuss  some  relationships 
between  the  canonical  correlation  and  Hankel  norm 
approaches.  There  is  no  interpretation  of  the 
canonical  correlations  as  minimizing  a  norm  as  in 
Equation  (4-6),  and  further  they  note  that  the 
singular  values  in  the  two  approaches  do  not  coincide. 
They  conclude  that  ’’because  they  (the  canonical 
correlations)  do  not  have  any  practical  significance 
about  the  energetic  structure  of  the  dynamics  of  the 
process,  the  properties  of  the  resulting  approximated 
models  are  not  clear”.  The  present  canonical 
variate  approach  makes  clear  that  the  energetic 
structure  of  the  dynamics  is  better  accounted  for  in 
the  prediction  error  measure  than  it  is  by  the 
Hankel  norm. 

4 . 4  Related  Literature 


In  the  classical  canonical  correlation  analysis 
(Hotelling  (1936),  Anderson  (1958)), 9  so  that 

the  prediction  errors  of  the  future  are  weighted  by 
their  inverse  covariance  matrix,  and  consequently 
the  future  d  is  normalized  to  have  identity  covariance 
matrix.  Also  the  canonical  covariances  are  then 
correlation  coefficients.  The  traditional  criterion, 
to  the  extent  that  there  has  been  such  discussions, 
has  concerned  the  mutual  information  (Shannon  and 
Weaver  (1962))  in  one  random  vector  p  about  another 
random  vector  f  defined  by 

Ppf(P.f> 

J(p;f)  - J  Ppf(p.f>  log  Pp(p)  Pf(?T  dP  df  (4'23) 


Now  consider  a  singular  value  decomposition  of  H 


B-4 


The  base  of  the  logarithm  is  arbitrary  and  determines 
the  particular  units  of  information,  and  is  the 

joint  and  p  and  pf  the  marginal  probability  den¬ 
sities  . 


Gelf and  and  Yaglon  (1959)  showed  that  the 
mutual  information  is  simply  expressed  in  terms  of 
the  canonical  correlations  Y1»*-**Yn  between  the  two 
vectors  by 

n  2 

J(p;f)  -  -4  I  log(l  -  Y.)  -  -%log  w  (4-24) 

j-1  3 

where  Hotelling  (1936)  defines  the  vector  alineation 
coefficient 

w  -  (1  -  Yj)  ...  (1  -  y\)  (4-25) 


as  a  measure  of  independence  of  p  and  f.  Gelfand  and 
Yaglom  (1959)  extend  the  definition  of  mutual 
information  to  vectors  of  infinitely  many  random 
variables,  e.g.»  random  processes  in  both  continuous 
time  and  discrete  time.  This  development  also 
provides  the  basis  for  extending  canonical  variates 
to  random  processes  (Yaglom  (1970)), 

Now,  if  a  restricted  number  k  of  linear  combin¬ 
ations  (c1*-..,c^)  of  the  past  of  one  random  process 

r(t)  are  used  to  predict  the  future  of  another  random 
process  s(t),  then  the  choice  maximizing  the  mutual 
information  is  the  first  k  canonical  variates  and 
the  mutual  information  is  expressed  by  the  first  k 
canonical  correlations 

a1,”aX,ak  J(cl,”*’ck{f>““,sl081^l<l"Yj)  (4-26) 

Thus,  the  canonical  correlation  method  provides  an 
optimal  procedure  in  terms  of  mutual  information  for 
choosing  a  finite  number  of  linear  combinations  of 
one  random  process  for  prediction  of  another. 

Recently,  in  the  statistical  literature  Yohai 
and  Garcia  Ben  (1980)  point  out  the  use  of  canonical 
variates  as  optimal  predictors.  They  show  that  the 
canonical  variates  (with  in  our  scheme) 

minimize  the  prediction  error 

! E(f  -  f)(f  -  f)T|  (4-27) 

where  denotes  determinant,  and  the  minimum  value 

is 

::ff|(l  -  Yj)...(l  -  y£).  (4-28) 

The  logarithm  of  this  expression  is,  within  a  fixed 
additive  constant,  proportional  to  the  minimized  mutual 
information  (4-26).  Rao  (1973)  gives  a  problem  in 
which  the  canonical  variates  for  the  measure 

E  | f  -  f ] 1 2  (4-29) 

are  to  be  determined  (9»I  in  our  scheme)  which  gives 
a  different  solution  from  the  classical  canonical 
correlation  problem  (S*"^ )  .  The  general  canonical 

variate  problem  using  a  general  prediction  error 
measure  (3-3)  was  formulated  and  solved  in  Porter  and 
Larimore  (1974)  for  a  nondynamical  problem.  This 
was  first  applied  to  reduced  order  modeling  for  the 
canonical  correlation  case  and  no  input  u(t)) 

in  Larimore  et  al.  (1977).  The  general  prediction 
error  measure  (arbitrary  9)  was  considered  for 
deterministic  impulse  response  modeling  in  Goldstein 
and  Larimore  (1980). 


5 .  State  Space  Realizations 


As  discussed  in  Section  3,  there  are  a  variety 
of  problems  of  interest  including  reduced-order 
stochastic  modeling  and  filtering.  The  most  general 
form  is  the  state  space  model 


x 


t+1 


+  Gut  +  w^ 


(5-1) 


v  *  Hx  +  Au  +  Bw  +  v  (5-2) 

t  t  y  t  t 

where  is  an  input  process,  xc  is  the  state  vector, 
is  white  process  noise  with  covariance  matrix  Q, 
and  v^  is  white  measurement  noise  uncorrelated  with 
with  covariance  matrix  R.  It  has  been  shown 
(Lindquist  and  Pavon  (1981)),  that  for  no  input  u^, 

the  form  (5-1)  and  (5-2)  is  the  most  general  state- 
space  realization  of  a  Markov  process,  and  that  the 
state  dimension  is  equal  to  the  Markov  order.  Other 
Markov  realizations  as  in  Akaike  (1975)  and  Baram 
(1981)  which  have  A-B»G*R«0  are  not  the  most  general 
and  may  require  much  higher  state  order  for  a  suitable 
approximation.  These  latter  forms  are  particularly 
inefficient  in  the  presence  of  moving  average  terms 
or  additive  white  measurement  noise.  As  will  be 
seen  below,  a  regression  interpretation  of  the 
state-space  matrices  makes  it  clear  that  the  error 
in  regression  is  ignored.  A  further  point  of 
Lindquist  and  Pavon  (1981)  is  that  for  a  parsimonious 
state  defined  by  Che  predictor  space,  the  past  and 
future  must  be  nonoverlapping  as  in  (3-1)  and  (3-2)  . 


For  the  purely  stochastic  case  with  u(t)-0,  let 


r (t)  -  y (t) ;  s(t)  *  y(t) 


(5-3) 


in  setting  up  the  past  p  and  future  f  as  in  (4-1) . 

For  the  case  of  a  deterministic  input  u(t)  ,  the  input 
must  also  be  included  in  the  past  so  that 

rT(t)  «  (u  ( t ) ,  y  ( t) ) ;  s(t)*y(t)  (5-4) 

Another  case  is  the  deterministic  input-output  system 
with  no  process  and  measurement  noises  w^  and  v  so 

r (t)  *  u(t),  s(t)  »  y ( t)  (5-5) 

Note  that  in  the  case  of  a  known  input  u(t)  present, 
the  covariance  function  of  u(t)  is  required  and  used 
in  specifying  the  most  important  components  of  the 
past  of  u(t)  to  include  in  the  state  for  prediction 
of  the  future  of  y(t) . 

Now  for  a  given  order  k  for  a  model,  we  wish  to 
find  a  best  k-element  state.  This  is  equivalent  to 
finding  the  k  linear  combinations  of  the  past  p 
which  have  the  best  ability  to  predict  the  future  f 
and  which  are  also  computable  recursively  in  time. 

The  optimal  k  linear  combinations  m(c)  given  by  the 
canonical  variate  analysis  (4-7)  through  (4-9)  is 
not  generally  recursively  computable  except  when  k 
is  the  full  order  &.  To  approximately  find  the  best 
k-order  state  x(t) ,  the  following  procedure  is  used. 
First  find  the  optimal  k-order  memory  from  the 
canonical  variate  analysis  for  any  k<2,  with  the 
minimal -order  realization  given  for  kml.  Considering 
k  fixed  below,  we  have 

m(t)  -  J^p(t)  where  J^CI^OJJ  (5-6) 

with  I,  the  kxk  identity, 
k 
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To  determine  a  state  x(t)  satisfying  the  recursive 
relations  (5-1)  and  (5-2)  which  approximately  gives 
the  optimal  k  order  memory  m(t) ,  the  optimal  predic¬ 
tion  of  m(t+l)  and  y(t)  using  m(t)  and  u(t)  is  deter¬ 
mined  from  simple  regression  relationships. 


The  dynamical  equations  (5-1)  and  (5-2)  express 
(x  ,  y  )  as  a  linear  combination  of  (x^.  ,  u^)  plus 
white  noise  vector  with  correlated  components.  Thus 
using  simple  multivariate  regression  procedures 
(Anderson  (1958)),  the  matrix  for  optimal  prediction 
of  (m  ^ ,  y  )  from  (mt ,  u^)  is 


a 


and  the  error  in  prediction  has  covariance  matrix 


(5-8) 

The  matrices  Q,  R,  and  B  are  simply  expressed  in  terms 

of  S  by 


Q 

B 

R 


s11 

b21  bll 

b22  zl  11  12 


(5-  9) 


where  (+)  denotes  the  pseudo inverse . 


Explicit  computation  of  the  covariance  matrices 
is  obtained  using  the  decomposition  and  the  covariance 
of  p,  f,  y  and  u  as 


L_  _J 


This,  then,  gives  the  covariance  matrices  explicitly 
in  terms  of  the  covariance  functions  involving  u(t) 
and  v(t).  In  the  purely  stochastic  case  that  there  is 
no  input  u(t)  present,  the  components  of  u  in  the 

T  T  C 

vector  (m  ,  u^)  are  deleted  and  G  and  A  are  then  not 

computed.  For  the  deterministic  case  where  w  and  v^ 

are  zero  for  the  full  order  realization,  it  may  be  of 
use  in  some  reduced-order  modeling  problems  to  compute 
Q,3,  and  R  to  give  a  reduced-order  model.  The  reduced- 
order  model  (5-7)  of  a  stable  system  can  be  shown 
(Fujishige  et.  al.  (1975))  always  to  be  stable. 

The  existing  literature  on  the  use  of  the  canon¬ 
ical  variate  method  in  deriving  the  above  reduced-order 
state-space  models  is  based  upon  Larimore  et  al  (1977). 
Baram  (1981)  and  Koehler  (1981)  describe  a  restricted 
stochastic  modeling  problem  using  the  canonical 
correlation  approach  (9-Z^,  no  input  u(t),  no  mea¬ 
surement  noise  v(t) ,  and  no  prediction  error  interpre¬ 
tation)  essentially  as  it  was  presented  in  Larimore 
et  al  (1977).  White  (1983)  gives  a  more  recent 


development  and  full  acknowledgement  for  the  origin 
of  these  ideas. 

6 .  Example  in  Impulse  Response  Modeling 

In  a  recent  study  (Goldstein  and  Larimore 
(1980))  for  the  National  Weather  Service,  the  can¬ 
onical  variate  procedure  was  used  for  deriving 
reduced-order  state-space  models  of  stream-flow 
dynamics.  This  was  a  necessary  component  in  that 
study  which  investigated  the  application  of  Kalman 
filtering  and  maximum  likelihood  parameter  identifi¬ 
cation  to  hydrologic  forecasting. 

The  problem  is  formulated  in  terms  of  a  given 
unit  hydrograph  h(7)  that  specifies  the  response  at 
lag  t  to  a  unit  pulse  input  at  time  zero.  It  is 
desired  to  find  a  state-space  model,  preferably  of 
low  order,  which  is  a  good  approximation  in  some 
sense  to  the  given  unit  hydrograph.  This  problem 
cannot  be  separated  from  the  characteristics  of  the 
input  process  since  the  modes  of  h(l)  that  are 
excited  and,  hence,  the  output  depend  strongly  upon 
the  input  process.  Nominally,  it  will  be  assumed 
that  the  input  process  is  white  noise  which  excites 
all  frequencies  proportionately.  If  the  typical 
input  signal  power  spectrum  is  known  and  different 
from  white  noise,  this  fact  can  be  easily  included 
and  would  lead  to  an  alternative  approximating 
state-space  model.  It  will  be  shown  that  the  white 
noise  assumption  leads  to  excellent  approximations 
of  the  unit  hydrograph  with  low-order  state-space 
models.  A  schematic  description  of  the  problem  is 
shown  in  Figure  1. 


Figure  1  Approximation  of  Unit  Hydrograph 
by  a  Reduced  Order  Filter 

The  reduced-order  state-space  modeling 
described  above  ha9  been  applied  to  unit  hydrographs 
for  a  number  of  river  basins  supplied  by  NWS.  The 
character  of  the  reduced-order  models  is  illustrated 
below  and  described  in  more  detail  in  Goldstein  and 
Larimore  (1980). 

Two  different  weightings  9  of  errors  in 
predicting  the  future  f  were  used,  6*1  giving  a  sum 
squared  error  or  energy  measure  and  9-E^  giving  a 
squared  relative  error  measure. 

The  differences  in  reduced-order  models  obtained 
from  these  two  measures  of  prediction  errors  depend 
very  strongly  upon  the  spectral  shape  of  the  hydro¬ 
graph  transfer  function.  A  striking  comparison  in 
fit  using  the  two  criteria  was  obtained  for  the  Bird 
Creek  basin  which  is  order  14.  The  six-hour  unit 
hydrographs  based  upon  the  input  hydrographs  for  4- 
and  8-state  models  are  shown  in  Figure  2  respectively 
for  the  two  cases  0-1  and  0-E^.  The  respective 

squared  magnitude  transfer  functions  are  shown  in 
Figure  3.  Note  in  Figure  2  that  even  the  8-state 
unit  hvdrograph  from  the  case  5-“^  has  a  significant 

nonzero  tail  whereas  the  4-state  unit  hydrograph  from 
the  case  5-1  produces  an  excellent  fit.  Figure  3a 


0 
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(2b)  Sum  Squared  Error,  9*1 

Figure  2.  Six-Hour  Unit  Hydrographs,  Original  (Solid), 
Eight-Order  (Dashed),  and  Fourth -Order  (Dashed  and 

Dotted) . 


clearly  illustrates  the  tendence  of  the  case  9*1  to 
fit  all  frequencies  with  nearly  equal  percent  error, 
whereas  from  Figure  3b  it  is  seen  that  in  the  case 
9*1  the  frequency  bands  of  highest  energy  are 
emphasized.  Thus,  for  a  hydrograph  with  a  large 
spectral  peak  and  complicated  spectral  shape,  i.e., 
requiring  a  high  order  rational  function  for  a  good 
approximation,  the  case  6-1  can  be  expected  to  excel 
in  fitting  the  unit  hydrograph. 

10  -  Conclusion 

The  canonical  variate  approach  provides  a  power¬ 
ful  and  general  procedure  for  reduced-order  modeling, 
filtering  and  system  identification.  The  procedure 
is  computationally  noniterative  and  incorporates  use 
of  a  singular  value  decomposition  which  is  numeri¬ 
cally  accurate  and  stable.  This  guarantees  a  compu¬ 
tational  solution  in  every  case.  All  reduced-order 
models  are  easily  computed  from  one  singular  value 
decomposition . 

This  paper  extends  the  pioneering  work  of  Aka ike  * s 
in  a  number  of  directions.  A  generalized  canonical 
variate  procedure  is  explicitly  described  in  terms 
of  minimizing  an  arbitrary  quadratic  weighting  of  the 
error  in  prediction  of  the  future  from  the  past.  This 


(3b)  Sum  Squared  Error,  0*1 

Figure  3.  Squared  Magnitude  Transfer  Function, 
Original  (Solid),  Eight-Order  (Dashed),  and  Fourth- 
Order  (Dashed  and  Dotted). 


considerably  extends  the  usefulness  of  the  method. 
While  Akaike  considered  only  the  case  of  process 
noise,  we  include  any  combination  of  inputs,  process 
and  measurement  noise.  This  extends  the  approach  of 
Akaike  to  the  reduced-order  filtering  and  transfer 
function  modeling  problems  as  well  as  modeling  in  the 
presence  of  an  input  function.  The  use  of  a  non¬ 
overlapping  past  and  future  lead  to  lower-order 
state-space  models.  Using  a  finite  past  and  future, 
simple  and  computationally  efficient  expressions 
are  explicitly  given  for  determining  reduced-order 
system  state-space  matrices.  A  particular  specializa¬ 
tion  of  the  canonical  variate  procedure  is  related  to 
the  Hankel  norm  method  for  deterministic  input-output 
systems,  however  the  former  has  an  interpretation  in 
terms  of  the  prediction  error  of  the  future. 


The  example  modeling  river  basin  dynamics 
illustrates  the  flexibility  of  the  general 
quadratic  weighting  of  the  error  in  predicting  the 
future  from  the  past.  The  classical  canonical 
correlation  procedure  leads  to  uniform  fitting  in  the 
frequency  domain  while  the  sum  square  error  criterion 
leads  to  a  uniform  fitting  of  the  unit  pulse 
response . 
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Abstract .  Model  Algorithmic  Control  (MAC)  is  a  relatively  new  design  methodology 
successfully  used  by  industries  for  the  last  several  years.  The  objective  of  this 
paper  is  to  investigate  robustness  properties  of  MAC,  and  evaluate  the  use  of 
adaptive  methods  for  real-time  identification  of  the  plant  under  closed-loop  control. 
Some  theoretical  robustness  properties  of  MAC  are  given  in  terms  of  classical  qualities 
such  as  gain  margin  and  phase  margin  for  a  wide  class  of  systems.  Although  MAC  is  an 
output-feedback  controller,  it  has  a  guaranteed  continuous-time  equivalent  phase  mar¬ 
gin  of  60°,  and  the  upward  gain  margin  can  be  made  arbitrarily  large  by  slowing  down 
the  reference  trajectory.  Some  robustness  properties  of  MAC  are  also  given  by  a 
perturbation  analysis  of  a  miss-modeled  plant  Impulse  response.  Preliminary  results 
are  discussed  for  on-line  identification  of  the  closed-loop  plant  using  the  canonical 
variate  method.  Performance  of  the  identification  of  the  plant  in  the  presence  of 
both  input  and  measurement  noise  is  given. 

Keywords .  Adaptive  control;  identification;  robustness;  canonical  variate  analysis; 
model  algorithmic  control. 


INTRODUCTION 


an  input  u(t)  and  an  output  v(t)  to  be 
controlled.  The  y(t)  are  related  through 
a  convolution  operator(*) 


The  MAC  methodology  generates  a  control 
sequence  by  on-line  optimization  of  a 
cost-functional,  and  the  algorithm  is  suit¬ 
able  for  implementation  on  microprocessors. 
One  of  the  attractive  features  of  MAC  is  the 
clear  And  transparent  relationship  between 
system  performance  and  various  design  para¬ 
meters  embedded  in  the  design  procedure. 

MAC  has  been  described  elaborately  in  the 
literature  (Mehra  et.  al.  (1977,  1979,  1980), 
Mereau  et.  al.  (1978),  Richalet  et.  al. 
(1978),  and  Rouhani  and  Mehra  (1982)),  and 
therefore  only  a  brief  description  of  MAC  is 
given  below.  The  z-transform  or  s-transforra 
of  a  time  function  is  denoted  by  replacing 
the  time-argument  by  z  or  s  respectively; 
for  example  v(z)  denotes  the  z-transform  of 
y(n) .  For  the  sake  of  simplicity  a  single- 
input  single-output  system  is  considered 
although  the  extension  to  multiinput  multi¬ 
output  plants  is  conceptually  straight- 
fo rward . 


or , 


(ii)  A  model  cf  the  plant  h(t)  *  ,h^,i  =1,...N 
with  output  y(t)  and  input  u(tj  so  that 


or , 


y(t)  =  h  ( t )  *  u ( t )  -  Ih.u(t-i) 


y(z)  =  h(z)u(z),  h(z)  *  -  tuz  ^  (1.1) 


N 

v(t)  *  1  h.u(t-i) 

.  .  i 


v(z)  =  h(z)u(z),  n(z)  *  1  h^z  (1.2) 


i-1 


There  are  five  basic  elements  in  MAC; 


(iii)  A  smooth  trajectory  y  (t)  initiated  on 


r 


(i)  .An  actual  plant  with  a  casual  pulse 
response  function  h(t)  =  '.h^f  i  =1,...N/, 


the  current  output  v(t)  that  leads  y(t)  to  a 
possibly  time  varying  set  point  c.  The  y^(t) 


1  This  work  was  supported  by  the  Air  Force 
Wright  Aeronautical  Laboratory. 
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evolves  as  follows: 


(z-l)h(z)u(z)  +  <l-30h(z)u(z) 


\ 


yr(t+l)  -  ayr(t)  +  (l-a)c(t) ,  y/t)  -  y(t)  (1.3a) 


(l-J)c(z) 


(1.7) 


yr(z) 


xz  ^y(z) 


+  ( 1  —  jt) z  h  (z) 


(1.3b) 


By  further  manipulation  (1.7)  can  be  express- 
ed  as 


where  l  is  a  constant  determining  the  speed  of 
response ; 

( iv)  a  closed  loop  prediction  scheme  for  pre¬ 
dicting  the  future  output  y  (t)  of  the  plant 

according  to  the  scheme 

y  (t+1)  *  y(t+l)  +*  y(t)  -  y(t)  (1.4a) 

y  (z)  =  y(z)  +  z  l(y(z)  -  y(z))  (1.4b) 

P 


and  finally 


(v)  a  quadratic  cost  functional  J  based  on 

the  error  between  y  (t)  and  v  (t)  over  a  finite 
P  '  r 


horizon  T: 


J 


1  [  (y  (t+i)-y  (t+i))2  w(i) 

i-1  P 

+  u“ ( t+i-l)r(i-l))  ] 


(1.5) 


where  w(i)  and  r(i)  are  time  varying  weights. 
Usually  r(i)  is  chosen  to  be  zero. 


u(z)  _  l-jt _  / *  ,,  . 

c(z)  *  (z-l)h(z)  +  (l-Vh(z)  J 

y(2)  ^  _ h( z)  ( 1 — a) _  , . 

c(z)  (z-l)h(z)  +  (l-u)h(z) 

Equations  (1.8)  imply  that  MAC  under  assump¬ 
tions  (i)-(iii)  above  is  equivalent  to  the 
following  classical  unity  feedback  configura¬ 
tion  in  an  input-output  sense. 


Fig.  1.  MAC  as  a  Classical  Controller 

This  interpretation  of  MAC  is  the  basis  of 
our  analysis  of  MAC  in  terms  of  classical 
control . 

PHASE  AND  GAIN  MARGINS 


Given  (i)  -  (v) ,  MAC  finds  an  optimal  control 
sequence  (u*(t+i-l),  1*1,... T-l  by  minimizing 
J  over  the  admissible  input  sequence 
iu(c+i-l)SiH  i) ,  1  *  1 ,  . .  . T - 1 ; .  Once  the 

optimal  control  sequence  is  computed,  the 
first  element  of  the  sequence  is  applied  to 
the  actual  plant  and  the  process  repeats  all 
over  again. 


The  block  within  the  dashed  line  can  be 
considered  as  a  dynamic  controller  of  the 
classical  type.  The  loop  transfer  function 
at  point  1  is 


L(z) 


h  (  z )  (  M) 

h(z) (z-1) 


(2.1a) 


To  investigate  the  theoretical  properties  of 
MAC  and  to  interpret  MAC  from  the  classical 
control  viewpoint  we  make  the  following 
assumptions : 


and  the  return  difference  function  is 


l+Uz) 


n(z)(z-l)  +  h(z)(l-:Q 

h(z) (z-1) 


(2.1b) 


(i)  the  actual  plant  h(z)  is  minimum  phase;  The  error  e(z)  *  c(z)  -  y(z)  in  tracking  is 

given  by 

(ii)  there  are  no  input  constraints,  i.e. 

F.(i)  *  R  for  all  i,  where  R  is  the  real  line;  e(z)  =  (1+L(z))*  c(z) 


(ill)  the  optimization  is  carried  over  one 
future  step  ahead  i.e.,  (T*l);  under  this 
condition  MAC  is  a  one-step  ahead  predictive 
controller . 

Under  these  simplifying  assumptions,  it  is 
sufficient  to  select  u*(t)  satisfying 

y  (t+1)  *  yr ( t+1 )  for  all  t  (1.6) 

for  a  minimum  of  the  cost  function  J.  The 
assumptions  (i)  -  (ii)  ensure  the  existence  of 
an  optimum  control  u*(t)  satisfying  (1.6).u*(t) 
is  then  implicitly  generated  by  v  (z)  =*  y^(z) 
so  that  P 


so  that  the  steady  state  error  due  to  a  step 
input  is 

e  ft)  *  Lim  (1+L  (z))-1  -  fl  +  L(l))'1  =  J 
SS  z-1 

which  is  a  consequence  of  a  builtin  inte¬ 
grator  in  the  compensator.  It  mav  be  noted 
that  using  the  set-up  of  Fig.  1  and  by  creat¬ 
ing  (1-a)  as  a  gain,  the  usual  classical 
root-locus  technique  can  be  applied  to 
analyze  the  behavior  of  the  closed-loop  poles 
as  a  changes  from  0  to  1.  To  make  the  root- 
locus  picture  complete,  the  characteristic 
polynomial  can  be  rearranged  with  a  modified 
gain  c  ■  \f  ( 1  —  y)  so  that  as  jt  changes  from 
0  to  1 ,  :  changes  from  0  to  infinity. 
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l-'X 

2 


2  (2.4) 


It  may  be  noted  from  Fig.  I  that  at  point  2, 
x(z)-y  (z)  when  h(z)  *  h(z),  where  v  (z)  is 
the  reference  signal.  This  shows  why  perfect 
tracking  is  possible  under  perfect 
identification.  We  will,  however,  not  pursue 
this  approach  here. 


L  (exp(  jo))) 


1-1 


cot 


and  L(exp(ju^,)  *  1.0  implies  the  unity 
gain  cross-over  frequency  at 


o 


2  sin 


-1  1-ot 

2 


(2.5) 


It  is  obvious  from  (1.8)  and  (2.1)  that  the 
closed-loop  system  is  internally  asymptotic¬ 
ally  stable  if  the  roots  of  the  rational 
function  * 

*cl(z)  =  (z-l)fi(z)  +  (l-a)h(z)  (2.2) 

are  within  the  open  unit  disk  z!<l,  and 
these  roots  are  also  the  roots  of  the  return 
difference  function  1  +  L(z).  We  can  there¬ 
fore  find  the  stability  margin  in  terms  of 
the  gain  margin  (GM)  and  phase  margin  (PM) 
from  the  Bode  plot  or  Nyquist  plot  of  the 
loop  transfer  function  L(z)  evaluated  on  the 
Nyquist  contour  z  -  exp(jut)  appropriately 
indented  around  the  poles  on  this  contour. 
Recall  that  in  continuous-time,  the  GM  and 
PM  are  those  values  of  k  and  b  respectively 
such  that  the  perturbed  loop  L(s)  * 
kexp(jD)L(s)  is  stable,  where  L(s)  is  the 
nominal  loop  and  s  is  the  Laplace  variable. 

A  similar  interpretation  goes  for  the  dis¬ 
crete-time  systems  (Kuo  (1980));  but  the  PM, 
unless  it  is  an  integral  value  of  the 
sampling  interval,  does  not  have  any  physical 
significance.  Strictly  speaking  the  complex 
constant  kexp(jb)  in  continuous  time  should 
be  replaced  by  kz“n,  n  an  integer,  for  measur¬ 
ing  GM  or  PM  of  the  discrete-time  system. 


The  Nyquist  plot  of  the  discrete-time  loop 
(2.4)  is  quite  simple  and  from  the  plot  it 
is  easy  to  see  that  the  system  Is  stable  for 
all  gain  e(0,  2/1-ot),  and  a  pure  delay 

b  *  90°  -  Sin-*  (l-a)/2  will  change  the 
number  of  encirclement  by  the  Nyquist  con¬ 
tour,  thus  making  the  system  unstable. 

To  get  the  equivalent  PM  we  transform  each 
element  of  the  loop  using  the  bilinear  trans¬ 
formation  s  *  (z-1) /(z+I) **  to  get  the 
equivalent  continuous  loop 

L(s)  -  (-  -1).  (2.6) 

i  3 

From  the  Nyquist  plot  of  L  (s)  it  is  obvious 
thatGM£(0,  2  / ( 1  — Dt)  (the  same  as  found  by 
analyzing  the  discrete-time  Nyquist  plot) 
and  a  PM  =  Cos-*  (l-Jt)/2. 

Theorem  1,  although  very  simple,  reveals 
some  intuitively  appealing  results  about  GM 
and  PM  of  MAC.  We  can  make  the  following 
remarks . 

Remarks : 

(1)  Since  ot£  [0,1],  the  guaranteed  upward 
CM  is  2  and  the  PM  is  60°  respectively. 


Another  way  to  compare  with  other  continuous¬ 
time  domain  design  techniques  is  that  each 
element  of  the  discrete-time  loop  should  be 
transformed  into  an  equivalent  continuious -t ime 
element  using  bilinear  transformation,  and  PM 
of  the  fictitious  continuous- t ime  loop  can 
be  taken  as  the  PM  of  the  discre te-time  loop. 

In  this  paper  the  word  PM  is  used  to  mean 
the  continuous-time  equivalent  phase  margin. 

We  can  now  state 

Theorem  l : 

Under  assumptions  (i)-(iii),  MAC  has 
GM  -  (0,  2  /  (  1-cO),  equivalent  PM*  Cos"^  l-ot) /2 , 
and  unity  gain  cross-over  frequency 
-  2  sin  -1  (i-ot)  J2. 

Proof :  The  proof  is  trivial  if  we  recall 

that  PM  and  GM  are  measured  on  a  nominal  loop. 
Here  we  can  assume  that  the  nominal  plant 
h(z)  *  h(z),  which  implies  h^  *  h^  and  N  *  N 
because  both  h(z)  and  h(z)  are  power  series 
in  z~* .  The  nominal  loop  transfer  function 
from  (2.1a)  is  then 

L(z)  =  -^7  (2.3) 

z-1 

i.e.  an  integrator  delayed  by  one-step. 
Evaluating  on  z  *  exp(j~),  we  get 


(ii)  We  can  always  trade-off  robustness 
against  the  speed  of  response.  As  response 
speed  is  increased  by  decreasing  a,  BW 

*  2  sin“l  (l-^)/2  increases  (which  makes 
sense)  with  a  consequent  reduction  of 
robustness  in  terms  of  GM  and  PM. 

(iii)  We  get  this  remarkable  PM  even  though 
MAC  is  an  out  put- feedback  controller  possibly 
because  the  plant  is  inverted  causally 
through  the  use  of  an  optimization  algorithm 
in  the  sense  that  at  each  time  the  algorithm 
provides  the  controller  with  the  entire 
future  input  sequence.  For  the  same  reason, 
the  discrete-time  loop  has  a  one  pole  roll¬ 
off  for  all  frequencies  -  which  is  rather 
unusual . 

(iv)  Theorem  1  ensures  that  the  controller 
can  stabilize  the  loop  for  all  the  plants 
ih.  belonging  to  the  set 

h1  h  *  kh  .  i*  1 , .  . . N ,  ke(o,  2/(l-:0)/. 

PLANT  ROBUSTNESS  ANALYSIS 

The  nominal  model  h(z)  is  usually  different 
from  the  actual  plant  h(z)  for  various 
reasons.  Sometimes  h(z)  Is  deliberately 
made  simple  to  facilitate  the  control  compu¬ 
tation  by  retaining  the  modes  in  the  active 
frequency  range.  On  many  occasions  it  is 


# 
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difficult  to  model  high  frequency  modes, 
and  these  are  simply  neglected.  Due  to  age¬ 
ing,  etc.,  the  modes  of  the  actual  plant 
drifts  slowly  thus  introducing  low-frequency 
error.  Thus  the  modelling  error  e(z)  has  in 
almost  every  case,  a  dynamic  structure;  and 
the  information  about  e(z)  must  be  incorpor¬ 
ated  in  designing  a  nominal  loop.  As  a 
minimum  amount  of  information  e(z)  is  expres¬ 
sed  as  an  upperbound  on  j e (exp(  j  <x>)) | ;  and 
the  purpose  of  robustness  analysis  is  to 
find  a  requirement  on  the  nominal  loop  in 
terms  of  this  upperbound  so  that  the  closed 
loop  performance  and  stability  is  maintained 
in  the  face  of  modelling  uncertainty. 

Usually  the  admissible  uncertainties  are 
expressed  in  two  ways:  additively  or  multi- 
plicativelv.  If  we  take  h(z)  as  the  nominal 
plant,  then  in  an  additively  uncertain  model, 
we  express  the  actual  plant  h(z)  as 


h(z)  *  h(z)  +  Ah  (z)) 
a 

(3.1) 

and 

in  a  mul t iplicat ively  uncertain 

model. 

the 

actual  plant  h(z)  is 

h(z)  -  h(z)(l  +  Ah  (z) 
m 

(3.2a) 

or 

h(z)  =  h(z)  Ah  (z) 

(3.2b) 

For  single-loop  systems  Che  order  of  multi¬ 
plication  in  (3.2)  is  irrelevant,  but  for 
MIMO  cases  che  order  is  important  because  of 
the  non-commutativity  of  matrices  where  inpuc 
channel  (left)  uncertainty  and  output-channel 
(right)  uncertainty  must  be  distinguished. 

Both  of  the  multiplicative  forms  in  (3.2) 
are  often  used  in  analysis,  but  in  this  paper 
we  shall  be  using  (3.2b).  Note  that  at  nom¬ 
inal  values  of  the  plant,  Ah  (z)  =  Ah  (z)  =*  0 

am 

and  Ah  (z)  =*  1.  Also  note  chat  the  classical 
m 

GM  and  PM  ensures  che  stability  of  a  perturb¬ 
ed  plant  of  the  form  (3.2b).  If  the  GM  is  k, 
then  Ah  (z)  ■  k,  and  if  the  PM  ■  n  (in  the 
sense  o¥  discrete-data  system),  Ah  (z)  *  z~n. 
These  are  undoubtedly  a  limited  class  of 
allowable  perturbations  and  we  must  consider 
other  possible  error-structures  in  designing 
the  nominal  loop.  The  framework  of  (3.1)  and 
(3.2)  is  more  general  in  the  sense  that  it 
can  handle  a  constant,  non-constant  and  even 
dynamic  model  mismatch  (say  for  example 
unmodelled  poles,  etc.).  Let  us  rewrite 
lii(z)  and  h(z)  as 

N 

h(z)  -  Z  n.z^*z^h(z)  (3.3a) 

1  P 


where  h  (z)  =1  h.z'  *  a  polynominal  in  z, 
p  i-i  1 

-N 

and  h(z)=z  h(z), 

P 


N  >:-  i 
h  (z)  =  -  h.  z 
p  i-i1 


(3.3b) 


Then  by  straight  forward  manipulation,  the 
closed  loop  characteristic  polynominal  is 


•t  (z)  -  (z-l)h  (z) 

cl,  p  P 

+  zN  (1-a)  h  (z)  o.i) 

P 


For  closed-loop  stability,  £cl,p(z)  must  have 
all  the  roots  strictly  inside  the  unit  disk 
izi*  1.  For  perfect  identification  N  =  N, 
hp{z)  -  hp( z)  ,  and  4>cl  (z)  -  zN ( z-Ct) hp( z)  . 

Of  course  the  zeros  of  n  (z)  and  z  will  be 
cancelled  eventually  leading  the  only  closed 
loop  pole  at  z  ■  a.  However  N,  the  order  of 
the  true  plant,  is  usually  unknown.  In  real- 
world  situations,  (3.4)  can  not  be  evaluated. 
The  actual  plant  h(z)  must  be  considered  as 
a  perturbation  of  the  nominal  plant  h(z) ,  and 
the  stability  conditions  must  be  derived  in 
terms  of  the  nominal  sequence  {h^  and  the 
perturbation  Ah  (z)  or  Ah  (z).  Let  us  assume 


Aha(z) 


that 
in  (3.3),  i . e  . 


and  Ahin(z)  can  be  expressed  as 


N 

a 

Ah  (z)  *  I  hz 
a  .  .  ai 


=*z^Ah  (z),Ah  (z) 
ap  ap 


a  polynomial  (3.4a) 
in  z 


Ah  (z) 

m 


N 

m 


i-l 


Ah  z  1 
mi 


z'Nm  h  (z) 
mp 


(3.4b) 


although  the  following  theorem  can  be  develop¬ 
ed  without  such  an  explicit  form.  Note  that 
the  index  in  (3.-*b)  must  start  from  0  to 
accomodate  constant  multiplicative  perturba¬ 
tion.  We  have  the  following  theorem  on 
robustness : 


Theorem  2: 


(i)  The  system  is  closed-loop  stable  for  all 
additive  perturbations  Ah&(z)  satisfying 


Ah  ( z) 
ap 


2- Cos   +  jc- 


1-, 


h  (z)  , 

P 


z  *  exp( j  4  (3.5a) 

(ii)  The  system  is  closed-loop  stable  for  all 
multiplicative  perturbations  Ah^(z)  satisfying 

(z)  -zNffl  "  (3.5b) 

np  1-u 


on  the  unit  circle,  where  Ah  (z)  and  Ah  (z) 

ap  mp 

are  given  by  (3.4). 


Proof:  The  proof  is  straightforward  if  we 

express  h(z)  using  the  form  (3.3)  -  (3.4), 
find  the  corresponding  closed-loop  characteris¬ 
tic  polynomial,  and  finally  use  Rouche's 
theorem  to  prove  (3.5)  on  the  assumption  that 
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the  nominal  loop  is  internally  stable  and  model  is  constructed  of  the  form 

hence  (z-i)  hp(z)  has  all  the  roots  strictly 

inside  the  unic  disk  z  *  1.  x  *  <t>x  +  Gu  +  w 

t+1  t  t  t 


The  tests  of  the  type  (3.5)  are  sufficient 
conditions  and  generally  tend  to  be 
conservative.  Nevertheless  we  can  make  the 
following  remarks: 

(i)  3oth  tests  (3.5a)  and  (3.5b)  are 

useful.  For  example  when  an  actual  known 

model  ;h^,  is  truncated  to  obtain 

h.t  i-1,.  N  <  N'-,  so  that  (Ah  .  -  h . , 

i  —  at  l 

i  =  N,  N  +  1,  .  .  ,N  and  Ah  .  -  0,  i<  BT}. 

ai  “ 

stability  around  ih.;  can  be  obtained  from 
( 3 . 5a; . 

( iiJ  For  constant  multiplicative  gain  mis¬ 
match.  i.e.  h,  =  kh,  for  all  i,  (Ah  .  *  k 
i  i  mi 

when  i*0  and  Ah  .  *  0  when  i>0),  so  that 
N  ml 

ih  (z)  -  kz  m  and  test  (3.5b)  yields  that 
mp 

the  system  is  stable  for  all  k  such  that 

k  -  I  <  — ^ — r~  ,  z  *  exp (j oo)  (3.6) 

But  it  is  easy  to  see  that  min  |exp(jto)-  i  |* 
l-:i  so  that  (3.6)  becomes  k-l  <  1  which 
implies  k£(0,2).  This  clearly  shows  that 
these  test  are  conservative.  (See  remarks 
(iv)  of  the  previous  section). 

CLOSED-LOOP  IDENTIFICATION 

The  results  of  identification  of  the  plant 
under  closed-loop  control  using  MAC  are 
described  in  this  section.  The  major  dif¬ 
ficulty  in  closed-loop  identification  is 
that  the  future  plant  inputs  are  correlated 
with  the  past  outputs  due  to  the  feedback, 
'tanv  identification  procedures  assume  the 
absence  of  si  ch  correlation,  and  produce 
biased  estimates  or  have  other  difficulties 
in  their  presence.  Maximum  likelihood  will 
handle  such  correlation,  but  can  be  computa¬ 
tionally  expensive  especially  if  not  prov¬ 
ided  with  good  initial  estimates  of  the 
parameters . 

For  identification  in  this  study,  the  can¬ 
onical  variate  analysis  method  was  used. 

This  approach  to  stochastic  realization  was 
first  proposed  by  Akaike  (1975).  A  recent 
generalization  (Larimore  (1983))  extends  the 
method  to  input-output  identification  in  the 
presence  of  noise.  The  method  is  based  upon 
a  decomposition  of  the  covariance  matrix  of 
the  past  p(t)  and  future  f(t)  of  the  plant 
input  process  u(t)  and  output  process  y(t) 
where 

?‘(t)  *  ru‘(t) ,  y‘(t),  uT(t-l),  yT(t-i), ...) 
f*ft)  =  fy'fc-l),  yT ( t+2 ) , . . . ) 


y  =  Hx  +  Au  +  Bw  +  v 
t  t  t  t 

where  w  and  v  are  white  noise  processed  chat 
are  independent  with  covariance  matrices  Q 
and  R  respectively.  These  white  noise 
processes  model  the  covariance  structure  of 
the  error  in  predicting  y  from  u.  Computa¬ 
tionally,  a  singular  value  decomposition  of 
the  sample  covariance  matrix  between  p(t) 
and  f(t)  is  used.  This  decomposition  is 
numerically  very  well  conditioned  and  stable. 

To  demonstrate  the  identification  algorithm, 
the  feedback  system  under  MAC  control  illus¬ 
trated  in  Fig.  2  was  considered  where  there 
is  input  white  noise  added  prior  to  observ¬ 
ing  the  plant  input  and  output  white  noise 
added  prior  to  observing  the  plant  output 
with  power  spectral  densities  and  Sq 

respectively.  The  particular  plant  consid¬ 
ered  is  the  very  lightly  damped  missile 
dynamics  model  (Mehra  et .  al . )  (1980) 


y  -  x  j 


where  the  states  are  -  <x  the  angle  of 
attack  (rad),  x?  =  P  the  perturbed  pitch  rate 
(rad/s),  input  Q  =  £a  the  elevator  angle 
(rad),  and  output  y  -  i  the  angle  of  attack 
(rad) .  An  analysis  of  the  dvnamics  gives  a 
natural  frequency  of  12. 24  r/s  (1.95  Hz)  and 
a  damping  ratio  ( Z)  of  0.061. 

The  canonical  variate  method  was  used  to 
identify  a  second-order  system  while  operat¬ 
ing  under  MAC  control  with  input  and  measure¬ 
ment  noise.  No  other  input  nor  change  in 
the  set  point  was  present,  and  the  system 
was  in  statistical  steady-state.  The  pre¬ 
sence  of  an  input  or  varying  set  point  would 
improve  the  ability  of  the  algorithms  to 
identify  the  plant.  The  plant  was  approx¬ 
imated  by  a  discrete  time  system  using  the 
exponential  transformation  at  a  sample  rate  of 
10  Hz.  This  was  used  for  the  actual  plant 
in  the  discrete  time  simulation,  and  in  the 
MAC  control  computations  the  discrete  time 
impulse  response  was  used  out  to  5  seconds 
and  set  to  zero  at  longer  times.  The  true 
and  identified  plant  models  are  shown  using 
sample  sizes  of  100  and  900  in  Fig.  2.  Note 
that  the  identified  plant  is  close  to  the 
true  even  for  a  substantial  amount  of 
measurement  noise. 


A  zanonical  variate  decomposition  of  the 
□variance  or  pyt)  ind  f ‘ t )  determines  the 
important  Mnear  como tnat ions  of  p(t)  for 
prediction  jz  : ' t >  .  From  this  a  state-space 
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ABSTRACT 


In  this  paper  the  mult lvartable  adaptive  control 
problem  is  addressed  using  the  Model  Algorithmic 
Control  (MAC)  method  in  conjunction  with  the  canonical 
variate  identification  method.  Under  some  simplifying 
assumptions  multivariable  MAC  is  shown  to  be  equivalent 
to  a  classical  controller  in  a  unit  feedback,  con¬ 
figuration.  Robustness  of  the  MAC  controller  against 
unmodelled  dynamics  is  assessed  by  perturbation  analy¬ 
sis.  The  canonical  variate  identification  method  is 
described  in  terms  of  choosing  a  state  of  a  given  order 
based  upon  past  information  to  optimally  predict  the 
future.  The  computation  is  a  noniterative  algebraic 
stochastic  realization  algorithm  that  involves  pri¬ 
marily  a  singular  value  decomposition  which  is  numeri¬ 
cally  very  stable  and  accurate.  The  canonical  variate 
method  is  shown  to  give  an  optimal  choice  of  instrumen¬ 
tal  variables,  and  simulation  results  show  it  to  be 
approximately  maximum  likelihood. 


1 .  MULTIVARIABLE  MAC  AS  A  CLASSICAL  CONTROLLER 

MAC  control  strategy  has  been  described  and  analyzed 
in  earlier  reports  and  publications  (Mehra  et  al,  1977, 
1979,  1980;  Mereau,  1978).  The  following  is  an 
extended  version  for  MIMO  plants. 

The  MAC  methodology  generates  a  control  sequence  by 
on-line  optimization  of  a  cost  functional,  and  the 
algorithm  is  suitable  for  implementation  on  micropro¬ 
cessors.  One  of  the  attractive  features  of  MAC  is  the 
clear  and  transparent  relationship  between  system  per¬ 
formance  and  various  design  parameters  embedded  in  the 
design  procedure.  We  assume  in  the  following  that  the 
input  sequence  u(n)  is  m-dimensional  and  the  output 
sequence  y(n)  is  p-dimensional .  There  are  five  basic 
elements  in  MAC: 

(i)  An  actual  stable  plant,  possibly  not  known 
exactly,  with  a  pulse  response  sequence  (Hn) , 

n-1,2 . N  where  each  1^  is  a  pxm  dimensional  matrix. 

(We  assume  for  simplicity  that  the  plant  has  no  time 
delay  element  and  is  purely  dynamic,  i.e.  it  has  no 
feedthrough  term).  Then  the  input  sequence  u(n)  and 
the  output  sequence  y(n),  are  related  by 

y(n)  -  Hju(n-l)  +  H2u(n-2)  +  ...  +  H^u(n-N)  (i.la) 

or,  Y(z)  -  H(z)U(z)  (1.1b) 

where  U(z),  Y(z)  and  H(z)  are  z-t ransf orms  of  y(n), 
u(n)  and  {Hn}  respectively.  Here 

H(z)  -  +  H2z"2  +  ...  +  Hnz-n  -  Hp(z)z“N 

1  This  work  was  supported  by  the  Air  Force  Wright 

Aeronautical  Laboratory  under  Contract  No. 
F33615-82-C-3600 


where  Hp(z)  is  a  pxm  dimensional  polynomial  matrix  In  z 
and  is  given  by 

Hp(z)  -  H1ZN-1  +  H2zn"2  +  ...  +  Hn  (L.lc) 

This  is  an  "all-zero"  model  and  Hp(z)  determines  the 
zeros  of  the  plant.  The  locations  of  non-minimum  phase 
zeros  impose  restrictions  on  the  achievable  performance 
of  MAC.  We  must  remind  the  reader  that  the  physical 
interpretation  of  a  zero  in  the  Impulse  response 
description  of  the  plant  is  different  from  that  of  a 
transmission  zero  in  a  rational  transfer  function  (RTF) 
model  (or  equivalently  difference  equation  (DE)  model) 
of  the  plant.  Also  the  physical  interpretation  of 
poles  of  a  RTF  model  as  natural  modes  of  a  plant  are 
lost  In  this  description. 


(ii)  An  internal  model  of  the  plant  having  the  same 
input-output  dimension  pxm  as  that  of  the  actual  plant 
and  the  pulse  response  sequence  {fln} ,  n-1,2,... ft.  The 
input  u(n)  is  the  same  as  that  to  the  actual  plant  and 
therefore  the  output  y(n)  of  the  model  is  given  by 


y(n)  -  flju(n-l)  +  fl2u(n~2)  +  ...  +  Rftu(n-ft)  (1.2a) 


or,  ?(z)  -  fl(z)U(z) 


( l -2b) 


where,  as  before, 

fl(z)  -  flp(z)  z_S  (1.2c) 

and  flp(z)  is  a  pxm  dimensional  polynomial  matrix.  {fln} 
is  generally  different  from  {Hn}. 


(iii)  A  p-dimensional  reference  trajectory  yr(n), 
preferably  smooth,  initialized  on  the  current  output  of 
the  actual  plant  y(n)  that  leads  y(n)  to  a  possibly 
time  varying  p-dimensional  set  point  c.  If  each  of  the 
reference  trajectories  yri(n)  has  a  first  order  dyna¬ 
mics  with  time  constant  leading  to  set  point  c^, 
i-I,2,...p  and  if  the  trajectories  do  not  interact 
with  each  other,  then  yr(n)  evolves  as 


yr(n+i)  -  Aa  yr(n)  +  (I-Aa)c,  yr(n)  -  y(n)  (1.3a) 
or,  zYr(z)  -  AaY(z)  +  (I-Aa)C(z)  ( l  .3b) 


where  Aa*  diag(a^)  and  C(z)  is  the  z-transform  of  c. 

(Iv)  A  closed  loop  prediction  scheme  for  predicting 
the  future  output  of  the  plant  according  to  the  scheme 

yp(n+l)  -  y(n+l)  +  y(n)  -  y(n)  (i.4a) 

or,  Yp(z)  -  ?(z)  +  z-l[Y(z)  -  ?(z)]  (1.4b) 

Here  yp(n)  is  also  p-diraensional. 
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To  see  that  Che  setup  la  Figure  l  Indeed  represents 
equation  (1.7),  note  that  at  point  l  we  have 


I 


(v)  A  quadratic  cost  functional  J  based  on  the 
error  between  yp(n)  and  yr(n)  over  a  finite  horizon  Tn 
(here  Tn  Is  an  Integer): 

T 

j  ,  ^  [  e^(n+k)W(  n+k)e( n+k)  O-^a) 

k-1 

+■  uT(n+k-l)R(n+k-l  )u(n+k-l)  1 


U(  z)  »  r1(z)(I-.\:i)E(z) 


-  fl“  l(z)(I-Aa)[C(z)-H(z)U(z)] 


-  Tr  yn[W(n+k)e(n+k)eT(n+k) 
k-1 

+  R(n+k-l)u(n+k~l)uT(n+k-l) ) 


(1.5b) 


where  W( • )  and  R( • )  are  positive  seraidefinite  time 
varying  weights  and  e(n+k)  -  yp(  n+k)-yr(  rH*k)  .  In  most 
MAC  applications  R( • )  is  set  to  be  zero. 

Given  (i)-(v),  MAC  finds  an  optimal  control 
sequence  (u*(n+i-l),  i-l,...Tnj  by  minimizing  J  over 
the  admissible  input  sequence  t  u( n+i-1 )  eQ( i ) ,  i-l .  .  .Tn  . 
Once  the  optimal  control  sequence  is  computed,  the 
first  element  of  the  sequence  is  applied  to  the  actual 
plant  and  the  process  repeats  all  over  again. 

In  general,  there  are  no  analytic  solutions  for 
the  control  sequence  ju*(n)}  -  it  is  computed  at  each 
step  using  an  algorithm  known  as  IDCOM.  Therefore  in 
its  most  general  form,  MAC  cannot  be  put  into  a 
classical  control  framework.  However  under  the 
following  simplifying  assumptions  MAC  can  be  modelled 
as  a  unit  feedback  configuration: 

(i)  The  actual  plant  H(z)  is  minimum  phase; 

(it)  The  plant  model  fi(z)  is  minimum  phase; 

(ill)  There  are  no  input  constraints,  i.e.  ft(l)-Rm 
for  all  i; 

(iv)  Tn-l,  i.e.  the  optimization  is  carried  over 
one  future  step  ahead*  Under  this  condition 
MAC  is  a  one-step  ahead  predictive 
controller. 

In  addition,  if  we  assume  that  the  plant  model  fl(z) 
is  exactly  known,  i.e.  ft(z)*H(z),  the  MAC  is  equivalent 
to  an  inverse  control  law.  However,  under  the 
simplifying  assumptions,  (i)-(iv),  it  is  sufficient  to 
select  u*(n)  to  satisfy 


y p ( n+ 1 )  *  yr(n+l)  for  all  n>0  (1.6) 

for  a  minimum  of  the  cost  function  J.  The  assumptions 
(i)-(iil)  ensure  the  existence  of  an  optimum  control 
u*( n)  that  satisfies  (1.6)  -  the  resulting  optimal  cost 
J*  is  zero  in  this  case.  However  U*(z)  is  then  impli¬ 
citly  generated  by  Yp(z)-yr(z)  so  that 

U*(z)  -  l(z-l)R(z)+(I-Aa)H(z)]“1[I-Aa]C(z)  (1.7a) 

Y(  z)  -  H(  z )  [  ( z-l )  R(  z )+( I-"Aa)H(z))~1[I_Aa]C(z)  (1.7b) 

Equations  (1.7a)  and  ( L . 7 b )  relate  the  setpoint  C(z) 
with  the  optimal  input  sequence  U  (z)  and  output 
sequence  Y(z).  It  Is  easy  to  see  that  this  simplified 
form  of  MAC  is  equivalent  to  the  following  MIMO  unit 
feedback  configuration  (we  have  dropped  henceforth  the 
the  *  superscript).  compensator 


Figure  1.  MAC  as  a  Classical  Controller 


Multiplying  both  sides  of  this  equation  by  (z-l)R(z) 
and  rearranging  we  have 

[  (  z- 1 )  fl(  z  )+( I-Aa)H(z) ]U(z)»( l-Aa)C(z) 

from  which  (1.7a)  and  (l.7b)  follow.  The  block  within 
the  dashed  line  in  Figure  l  can  be  thought  of  as  a 
dynamic  controller  of  the  classical  type.  The  loop 
transfer  function  at  the  plant  Input  (point  l)  is  given 
by 


L(z)>  -ip  r1(z)(I-Aa)H(z) 

z-1 


(1.8) 


and  determines  the  robustness  of  the  feedback  con¬ 
figuration  at  this  point.  When  we  have  perfect  iden¬ 
tification,  i.e.  H(z)  -  fl(z),  then  at  point  2 

0(z)  -  Y(z)  -  ~  (I-Aa)E(z) 
z-  L 

or  0(z)  -z-~-  (I-Aa)|C(z)-0(z)1 

or  zO(z)  =■  AaC(z)+(I-Aa)C(z)  (1.9) 

Equation  (1.9)  is  equivalent  to 

u(n+l )  -  Aau( n)+( I- Aa)c ,  u(n)  -  y(n) 

which  shows  that  u(n)  is  the  reference  trajectory 
sequence  yr(n)  as  shown  in  equation  (L.3a).  This  means 
that  when  the  plant  model  is  known  exactly,  the  control 
sequence  U(z)  is  generated  as 

U(z)  >  H_1 ( z) 0( z )  =  H_1(z)Yr(z)  (1.10a) 

Therefore  the  output  of  the  actual  plant  is 

Y(z)  -  H(z)U(z)  »  Yr(z)  (1.10b) 

which  shows  that  in  steady  state  the  plant  output  y(n) 
is  identical  to  the  reference  trajectory  yr(n)  -  per¬ 
fect  tracking  has  been  achieved.  Equation  (1.10a) 
clearly  shows  the  need  for  H(z)  to  be  minimum  phase. 

This  analysis  has  revealed  another  interesting  property 
of  MAC.  Exact  tracking  could  as  well  be  achieved  by 
inverting  the  plant  to  generate  the  sequence  u(z)  In  an 
open-loop  configuration,  but  In  MAC  it  does  so  in  a 
closed-loop  configuration.  Therefore  the  additional 
benefits  of  a  feedback  configuration  such  as  distur¬ 
bance  rejection,  sensitivity  reduction,  etc,  are  also 
obtained  while  slraultatneous ly  achieving  exact 
tracking. 

2.  ROBUSTNESS  ANALYSIS  OF  MIMO  MAC 

In  the  following  analysis  we  describe  MAC  using  a 
rational  transfer  function  or  difference  equation  (DE) 
model.  It  can  be  shown  that  the  simplified  MAC  using 
DE  description  of  the  dynamics  is  also  equivalent  to 
the  unit  feedback  configuration  in  Figure  1.  The 
advantage  of  using  this  description  is  that  the  robust¬ 
ness  of  the  closed  loop  can  be  examined  in  terms  of  the 
recently  developed  criteria  employing  the  loop  transfer 
function  and  return  difference  function  at  appropriate 
points  in  the  loop. 
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Let  us  analyze  the  Loop  transfer  function  L2(z)  at 
t  ho  output  of  thn  pl.mf,  I.e.  it-  point  2  .  Ln(z)  Is 
glvi?n  by 

L2(z)  *  H(z)tf  l(z)(I-Aa)  (2-l) 

Here  ft(z)  is  the  raodeL  of  the  actual  plant.  In  a  nomi¬ 
nal  design,  the  actual  plant  H(z)  is  assumed  to  be 
equal  to  the  model  ft(z),  i.e.  H(z)  *  ft(z).  However  the 
return  difference  function  H2(z)  is  then 

H^(z)  -  I+L2(z)  -  [(z-l)I+H(z)fl'1(z)(I-Aa)](2>2) 


has  a  fast  dynamics,  i.e.  a1  -«),  then  MAC  has  in  .ipwar  I 
Xaln  margin  of  h  dh  on  that  channel.  As  a  rn.it  tor  u| 
fact  6  db  upward  gain  margin  Is  a  guaranteed  one  for 
each  channel.  However  on  the  other  hand  by  slowing 
down  the  reference  trajectory,  i.e.  by  making  n^l.O, 
the  upward  gain  margin  at  each  channel  can  be  increased 
to  Infinity  -  which  is  an  unusual  result  for  an  output 
feedback  control  scheme. 

Using  the  above  analysis,  the  tolerence  of  the 
nominal  loop  to  any  perturbation  K  (not  necessarily 
diagonal)  can  be  obtained.  If  the  perturbation  is 
dynamic,  the  analysis  is  slightly  complicated  as  shown 
in  the  following. 


The  closed  loop  poles  are  given  by  the  zeros  of 
det(H2(z)).  For  the  nominal  loop,  i.e.  H(z)“R2(z)>  c^e 
closed  loop  poles  are  given  by  the  zeros  of 

det(zI-Aa)-0  (2,3) 

which  shows  that  the  plant  dynamics  are  cancelled  and 
the  overall  behavior  of  the  loop  is  governed  by  the 
reference  trajectory  dynamics  as  given  by  the  poles  <x\ , 
a2  • • • Op  In  this  case  each  of  the  p-outputs  is  iden¬ 
tical  to  the  corresponding  reference  trajectory  y  (n), 

1-1 ,  2,...p.  rl 


We  select  the  internal  model  ft(z)  and  therefore 
fl(z)  is  completely  known  to  us.  On  the  otherhand  the 
plant  H(z)  is  not  known  to  us  exactly.  It  is  customary 
to  think  that  the  actual  pLant  H(z)  Lies  in  a  neigh¬ 
borhood  of  fl(z).  If  we  define  this  neighborhood  by  an 

additive  perturbation  AH  (z)  that  satisfies 
a 

o(AHa(e^,iJ))<a([D)  ,  0<uj<2tt,  (2.3 

then  we  assume  that  H(z)  lies  among  the  class 

z )+AH^( z )  (2.9) 


This  property  of  MAC  is  also  obtained  if  we  com¬ 
pute  the  overall  transfer  function  T2(z)  from  the 
reference  trajectory  set  point  C(z)  to  the  output  Y(z), 
I.e.  Y(  z )  -  T2(z)C(z).  Since  I.  (z)  -  (l-A,t)/(  z-l)  we 
have 

T2(z)  "  L2(z)(I+L2(z))-1  -  (I-Aa)(zl-Aa)~l  (2.3a) 

Since  is  diagonal,  T2(z)  is  also  diagonal.  This 
shows  that  the  overall  transfer  function  is  non¬ 
interacting:  any  change  in  the  reference  trajectory 

parameter  in  the  i-th  input  channel  affects  the  output 
in  i-th  channel  only;  the  other  output  channels  are  not 
affected  at  all.  This  decoupling  property  of  MAC  has 
made  it  very  popular  in  industries  where  the  practicing 
engineer  always  prefers  a  decoupling  control  strategy. 
This  characteristic  of  MAC  as  an  output  feedback 
controller  is  outstanding. 

It  is  straightforward  to  compute  the  gain  margin 
from  (2.1)  if  we  recall  that  it  is  the  tolerence  of 
Che  nominal  Loop  to  a  multiplicative  perturbation.  In 
this  case  Che  gain  margin  is  given  by  Che  range  of 
values  of  a  diagonal  matrix  K-dlag(k^)  such  chat  Che 
perturbed  loop  remains  stable.  Here  Che  perturbed  loop 
L2p(z)  Is  given  by 

4p(z)  '  ™(z)jf '(z)(i-Aa)  -  -jL  KU-AJ  (2-4) 

The  closed  loop  poles  are  given  by 

det( (z-i) I+K-Aa)-0  (2.5) 


where  AHa(z)  is  given  by  (2.8).  Here  a  (X)  denotes  Che 
maximum  singular  value  of  X  and  a(u>)  is  a  frequency 
dependent  function  that  19  usually  known  to  a  designer 
from  his  aprlori  experience  with  the  system.  For  con¬ 
venience  of  the  analysis  we  assume  that  AHa(z)  is  ana¬ 
lytic  in  z  >1 .  To  analyze  The  worst  possible  case, 
suppose  the  actual  plant  H(z)  lies  on  the  boundary  of 
the  class  of  systems  given  in  (2.9),  i.e. 

H(z)  -  ft(z)+AH  (z) 
a 

The  perturbed  loop  L^^(z)  is  then 

L2p(z)=*  !  fl(z)+AHa(z)  ]  fl  l(z)(I-A„) 

-  [I+AH^zJffLz))  [I-A0] 

The  closed  loop  poles  of  the  perturbed  loop  is  given  by 
the  zeros  of 

det( I+L2p(z) )-0  (2.12) 

and  we  want  to  find  some  condition  on  ft(z)  such  that 
the  zeros  in  (2.12)  lie  within  the  unit  circle  |z|<l. 
The  nominal  Loop  transfer  function  L2(z)  and  return 
difference'  function  H2(z)  are  obtained  from  (2.l)as 

L^.-^a-A,)  (2-lJa) 

H2(z)-I+L2(z)  -  (zI-Aa)  (2.13b) 


(2.10) 

(2.11) 


If  Aa  is  diagonal  and  if  z^  are  the  roots  of  equation 
(2.5),  then  the  closed  loop  poles  z ^  are  given  by 

Zi+ki~l~k:iai“0  or  zi”kiai+1‘~k:i  (2.6) 

Clearly  the  perturbed  loop  is  stable  if  for  all  kt , 
|zjJ<1,  This  immediately  implies  that 


0<k  < 


2 


(2.7) 


Clearly  the  perturbed  loop  transfer  function 
L2p(z)  can  be  considered  as  a  perturbation  AL2(z)  of 
the  nominal  loop  transfer  function  L2(z)  where 

AL  (z)-  -i-[AH  (z)fl_1(z)]  [I-A  )  (2, 

4  z-- 1  a  (x 

and  ( z)-L_ ( z)  +  AL„( z) . 

Zp  2  2 

Equation  (2.12)  then  reduces  to 


which  agrees  with  the  SISO  results  obtained  earlier. 
This  is  a  satisfactory  result  since  MAC  is  an  output- 
feedback  controller  and  not  a  s tate-f eedback  one. 
Equation  (2.7)  shows  that  if  any  reference  trajectory 


det(I+L2(z)+AL2(z))-0.  (2.15) 

The  nominal  Loop  is  stable,  and,  aa  we  have  seen 
earlier.  It  has  p  number  of  rlosod  loop  poles  In 
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|rj  l.O.  Then  a  sufficient  condition  that 
det(  £+L.2p(  z) )  in  (2.15)  has  also  p  number  of  zeros  in 
|  2  |< l .0  Is  (Sain  (1981)) 

5(AL2(eJ,i,))<a(l+L2(ej“))  (2-16) 

where  a(X)  denotes  the  smallest  singular  value  of  X. 

If  AL2Tz)  satisfies  (2.16),  the  perturbed  loop  is 
stable.  Equation  (2.16)  can  further  be  simplified  as 
follows.  First  note  that  d(mX) Ca(mY)  implies 
c(X)<o_( Y).  Then  using  (2.14),  and  (2.13)  in  (2.16), 
we  have 

5  [aHa(eju,)R“1(e:1“)(I-/\a)]<;£(eja)I-ACI)  (2.17) 

which  is  implied  by 

J[AHa(eJu’)]a[!t(ej'J,)15[I-A0ll<a(eJ“l-Aa)  (2.18) 

Let  amjn  -  min  aj  where  each  aj.  Is  such  that 
Then  1 


with  '/( t  J  a  measurement  noise  and  w<  t )  j  process  noise 
with  respective  cross  spectral  density  matrlcies  R  and 
Q.  From  the  theory  of  Harkov  processes  and  in  par¬ 
ticular  the  theory  of  stochastic  realization,  the  mini¬ 
mal  state  vector  defines  the  information  from  the  past 
relevant  to  the  future  of  the  process  and  is  called  the 
predictor  space  (Akaike,  1974a). 

The  approach  of  canonical  variables  for  system 
identification  is  to  determine  the  optimal  set  of 
linear  combinations  m(t)  of  the  past  p(t)  that  best 
predict  the  future  f(t)  in  terras  of  minimizing  the  pre¬ 
diction  error 

E|  |f  -  i  II2  -  E[(f  -  f)T  Eff  (f  -  £)]  0.3) 

where  £ff  Is  the  covariance  matrix  of  the  future  f  and 
f  is  the  best  prediction  of  f  based  upon  the  memory 
ra(t).  This  optimization  problem  involves  the  optimal 
selection  of  the  dimension  of  m(t)  as  well  an  the  opti¬ 
mal  selection  of  the  linear  combinations  of  the  past. 


o  l I-Art ] " l-a  , 
a  min 

Equation  (2.18)  is  satisfied  if 

o[R(eju))]  <  a(“)>Q~>in)_ 
o(eJ"I-,\a) 


(2.19) 


(2. 19) 


The  RHS  of  (2.19)  is  precoraputable .  If  the  identified 
model  ft(z)  satisfies  (2.19),  then  the  MAC  control  law 
is  stable  for  all  plants  under  the  class  given  by 
(2.8).  However,  we  are  still  looking  at  the  physical 
interpretation  of  the  condition  given  in  (2.19).  For 
SISO  systems,  the  singular  value  is  repLaced  by  the 
magnitude  function. 

Similiar  relations  can  be  drived  for  multiplica¬ 
tive  perturbations  and  for  modelling  uncertainties  at 
the  plant  Input. 


3.  SYSTEM  IDENTIFICATION  HSING  CANONICAL  VARIABLES 


Proposed  methods  for  multivariable  parameter  iden¬ 
tification  are  plagued  with  problems  of  computational 
complexity  and  unreliability.  For  iterative  optimiza¬ 
tion  approaches  such  as  maximum  likelihood,  there  is  no 
apriori  bound  on  the  number  of  iterations  required  for 
convergence  unless  a  good  initial  estimate  is 
available.  The  computations  involved  in  many  schemes 
become  lllconditioned  if  the  parameter  ident if  lability 
is  lllconditioned  which  occurs  frequently  in  practice. 
The  canonical  variate  method  involves  solution  of  an 
algebraic  problem  involving  primarily  a  singular  value 
decomposition  which  is  numerically  accurate  and  stable 
for  any  set  of  data.  The  system  is  identified  in  an 
implicit  state  space  form  which  avoids  the  iden- 
tifiability  problem. 

The  approach  to  system  identification  using 
generalized  canonical  variables  is  described  in  some 
detail  in  Laritnore  (  1983b).  That  approach  involves 


consideration  of  the  past  p(t)  and  future  f(t)  of  a 
vector  process  at  a  time  t  defined  as 

pT(t)  -  (yT(t),  ,iT<t).  yT(  t-l) ,  uT(t-l),  ...  )  (3.1) 

fT(C)  -  (yT(  t+1 )  ,  yT(t+2),  ...  )  (3.2) 

where  u(t)  is  the  input  and  y(t)  is  the  output  of  an 
unknown  system  with  state  space  structure  of  the  form 

x(t+l)  -  <t>x(t)  -  Gu(t)  +  w(t)  (3.3) 

y(t)  -  Hx(t)  +  Au( t )  +  Bw( t)  +  v(t)  (3.4) 


The  problem  of  minimizing  (4.5)  is  precisely  a 
generalization  of  the  classical  canonical  correlation 
analysis  problem  of  mathematical  statistics  (Hotelling, 
1936).  Modern  computational  procedures  use  a  genera¬ 
lized  singular  value  decomposition  (SVD)  (Golub,  1969) 
involving  the  covariance  matrlcies  of  the  past  and 
future.  The  generalized  SVD  determines  transformations 
J  and  L  and  a  diagonal  matrix  D  such  that 

J  £pf  L  -  Diag(-ri>...>7h>°.***.°)  *  D 

J  2pp  J  -  I  ;  L  Iff  L  -  I  (3.7) 

The  transformations  can  be  interpreted  as  defining  a 
new  set  of  coordinates  for  the  past  and  future  in  which 
the  covariances  are  D,  I,  and  I  respectively  as  given 
in  the  last  equations  (3.6)  and  (3.7).  For  a  full 
order  state  model,  the  optimal  memory  or  state  x(t)  is 
related  to  the  past  p(t)  in  terms  of  the  first  h  cano¬ 
nical  variables  as  ra(t)  ■  (I,0)Jp(t),  i.e.  the  first  h 
components  of  the  canonical  predictor  variables  Jp(t). 

A  minimal  order  realization  is  obtained  with  this 
choice  of  state.  The  computation  of  the  state  space 
matrlcies  is  given  in  Lartraore  (1983b). 

In  system  identification,  the  covariance  matrlcies 
are  not  known  but  are  estimated  from  the  observations. 
The  statistical  determination  of  rank  in  the  canonical 
variate  analysis  is  given  approximately  using  standard 
canonical  correlation  analysis  methods  (Akalke,  1976). 

A  more  refined  comparison  between  the  different  order 
models  is  given  by  use  of  the  Akaike  information  cri¬ 
terion  (AIC)  which  is  asymptotically  optimal  in  mini¬ 
mizing  entropy  (Shibata,  1981).  The  use  of  entropy 
measures  such  as  the  AIC  has  a  fundamental  justifica¬ 
tion  in  terms  of  the  basic  statistical  principles  of 
sufficiency  and  repeated  sampling  (Lariraore,  1983a). 

The  minimal  order  realization  can  be  determined 
from  the  canonical  correlation  analysis  with  Ic-h. 
However  with  k<h  when  a  reduced  memory  is  selected,  the 
approximate  system  does  not  in  general  minimize  the 
prediction  error  for  that  order.  This  is  because  the 
reduced  rank  canonical  variables  are  not  in  general 
recursively  computable.  However  in  the  case  of  the 
statistical  rank  determination  problem,  there  Is  an 
Insignificant  difference  between  the  state  of  the 
realized  system  corresponding  to  the  statistically 
optimum  choice  of  order  and  the  full  rank  canonical 
variables . 

The  instrumental  variables  method  has  a  natural 
interpretation  in  terms  of  the  generalized  canonical 
variate  problem.  In  the  instrumental  variables 
approach,  the  state  equations  (3.3)  are  considered  as 
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unobserved  structural  relationships  that  are  indirectly 
observed  through  the  noisy  measurement  equations  (3.4). 

A  vector  ra(t)  of  instrumental  variables  Is  constructed 
which  is  hopefully  close  to  the  true  state  x(t).  This 
is  used  in  place  of  the  true  state  in  solving  the 
problem.  This  apparently  works  well  for  an  appropriate 
choice  of  the  instrumental  variables. 

<c  •  ' 

A  more  general  problem  is  the  optimal  choice  of 
instrumental  variables  as  posed  by  Rao(1965,  1979). 

This  is  formulated  as  finding  the  optimal  choice  of  k 
linear  combinations  of  the  past  p(t)  that  predict  the 
future  f(t)  as  measured  in  terms  of  the  squared  error 
(f  -  £)T(f  -  f) .  This  is  precisely  the  generalized 
canonical  variate  problem  (Larimore,  1983b)  with 
weighting  matrix  0-1.  if  k  is  chosen  as  full  rank, 
then  the  memory  and  the  state  space  realization  are 
independent  of  the  weighting  matrix  0*1  replacing  Jiff 
in  (343)  and  (3.7)  (Larimore,  1983b).  However,  for 
lower  rank  k<h ,  there  can  be  a  considerable  difference 
between  the  state  space  and  reduced  order  system  for 
different  weightings  0  (Larimore,  1983b).  The  squared 
error  reiates  to  energy  while  the  canonical  correlation 
analysis  relates  to  the  statistical  significance  of  the 
problem.  The  canonical  correlation  analysts  can  be 
viewed  as  an  optimal  choice  of  the  instrumental 
variables  using  the  appropriate  weighting  (3.5)  of  the 
prediction  errors  for  the  determination  of  the  sta¬ 
tistically  significant  number  of  states. 

Time  recursive  methods  using  instrumental 
variables  and  approximate  maximum  likelihood  (IV-AML) 
are  claimed  to  be  approximate Ly  efficient  parameter 
identification  methods  for  large  samples  as  shown  in 
simulation  examples  (Young,  1979).  This  is  shown  by 
Monti  Carlo  simulation  and  by  estimating  the  parameter 
by  Monte  Carlo  simulation  that  the  canonical  correla¬ 
tion  method  also  gives  efficient  identification  of  the 
system  dynamics.  This  is  done  by  evaluating  the 
spectral  estimation  error. 

4 .  EFFICIENCY  OF  CANONICAL  CORRELATION  ANALYSIS 

The  asymptotic  efficiency  of  system  identification 
using  canonical  correlation  analysis  is  discussed  in 
this  section.  An  entropy  measure  of  the  error  between 
the  true  and  identified  system  is  used  to  measure  Che 
error  in  estimating  the  spectrum. 

To  directly  describe  errors  in  the  identified 
system,  a  recently  developed  entropy  measure  of  the 
system  identification  errors  involving  the  power 
spectrum  is  used.  The  entropy  measure  is  a  fundamental 
measure  of  the  error  in  approximating  a  system  using  a 
model  selection  procedure  that  may  include  the  choice 
of  model  order  such  as  state  apace  dimension.  In  a 
predictive  inference  setting,  the  entropy  measure 
follows  naturally  from  the  fundamental  statistical 
principles  of  sufficiency  and  repeated  sampling 
(Larimore ,1983a). 

Consider  a  vector  stationary  Gaussian  process 
. . . ,y(-l) ,y(0) ty( 1) , . • . ,  with  power  cross-spectral  den¬ 
sity  matrix  S(m),  and  suppose  that  some  parameter  esti¬ 
mation  or  model  fitting  scheme  is  used  to  choose  a 
model  S(u>)  based  upon  a  sample  of  N  time  observations. 
The  negative  entropy  per  unit  time,  or  negentropy  for 
brevity,  for  measuring  the  error  between  the  true 
spectrum  S(w)  and  the  model  selection  procedure  which 
estimates  the  spectrum  $(u>)  can  be  expressed  as 


N(S,S)-E-i  ^(log  S(.)S-Uii)  (1.1) 

+  cr[  I  -  S(<d)S“'('j>)]  }  4“ 

L  TT 

■  e  — •  /  tHs-kuHSN)  -  S(  j)]}2 

4  -IT 

where  expectation  is  taken  with  respect  to  the  para¬ 
meter  estimates,  and  the  approximation  holds  to  second 
order  in  the  elements  of  S-S.  The  last  expression  is  a 
generalization  of  the  Integrated  squared  relative  error 
in  estimating  the  power  spectrum  S,  and  there  is  an 
interpretation  in  the  multivariable  case  in  terras  of 
the  principle  components  of  the  power  cross-spectral 
matrix  (Larimore , 1984 ) . 

In  the  case  of  ML  estimation  of  the  parameters  9, 
the  estimates  are  asymptotically  consistent  and  effi¬ 
cient  achieving  the  Craraer-Rao  lower  bound  E(9-8)(3-3)^ 
Using  this  lower  bound,  the  lower  bound  on  the 
entropy  measure  of  spectral  accuracy  is  derived  as 
E{N(S,S) f  <  k/2N.  This  implies  the  lower  bound 
(Larimore,  1982,  1984) 

E  /  tr{S-l(u)[S(u>)  -  S(w))}2  (4.2) 

on  the  expected  integral  of  the  relative  squared  error. 
This  is  a  fundamental  bound  on  the  achievable  accuracy 
in  spectral  estimation. 

To  demonstrate  the  efficiency  of  the  canonical 
variate  method  of  system  identification  relative  to 
MLE ,  the  spectral  accuracy  of  the  method  was  compared 
with  the  lower  bound  (4.2).  The  autoregressive  moving 
average  (ARMA)  process 

y(t)  -  1.3136  y(t-l)  -  1.4401  y(t-2)  +  1.0919  y(t-3) 

-  0.83527  y( t-4)  +  v(t)  +  0.17921  w(t-l) 

+  0.82020  w( t-2)  +  0.26764  w(t-3)  (4.3) 

of  order  (4,3)  respectively  for  the  AR  and  MA  parts 
with  the  noise  variance  of  w  as  Q  *  1  .72581 E—  2  was  used 
to  simulate  samples  of  size  N-800.  This  process  was 
analyzed  by  Gersch  and  Sharp  (1973)  and  Akaike  (1974b) 
to  show  the  increased  accuracy  of  ARMA  models  over  AR 
models.  The  canonical  variate  analysis  was  done  on 
sample  covariance  raatricies  involving  16  lags  of  the 
past  and  future. 

Figure  2(a)  shows  the  power  spectrum  of  the  true 
and  estimated  models  for  6  Monte  Carlo  trials  of  N-800 
samples  each.  The  estimated  spectrum  appears  to  be 
close  to  the  true  spectrum  with  a  small  bias  at  the 
peaks  and  troughs.  Figure  2(b)  gives  the  squared  rela¬ 
tive  error  of  the  variability,  excluding  the  bias,  in 
estimating  the  power  spectrum  at  each  frequency  along 
with  the  lower  bound  for  the  expected  squared  relative 
error.  The  average  of  the  errors  over  the  6  Monte 
Carlo  trials  is  very  close  to  the  lower  bound 
demonstrating  the  relative  efficiency  of  the  canonical 
variate  method. 

In  Larimore  et  al  (  1983),  the  ident if  teat  ion  of  a 
very  lightly  damped  plant  under  closed-loop  control 
using  MAC  is  simulated  in  Monte  Carlo  trials.  The  ade¬ 
quacy  of  the  identified  model  is  demonstrated  by  com¬ 
paring  the  fitted  Impulse  response  and  transfer 
function  with  the  known  plant  dynamics. 
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Figure  2.  Variability  of  Monte  Carlo  Simulations  in 
Terras  of  (a)  Power  Spectral  Density  and 
(b)  Squared  Relative  Error  for  Simulations 
and  Lower  Bound  Average 
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density  matrices  and  Fourier  coefficients.  For  simplicity  the  time  series  case  with  < 
a  scalar  is  developed  below,  however  the  results  generalize  easily  to  the  random 
field  case  of  a  vector  r  .  Then  asymptotically  the  log  likelihood  function  is  given 
following  Whittle  (1953)  and  Larimore  (1977)  with  CM  =  F («)-*(<•)  “<*  usin8 
the  relationship  AQ  (to)X  (to)  =  0  by 

log  p(x  ,0)  =  — ^-log2Tr  -  y  / [log|  Sqq(u)\  +  Q  {u)Sqq{<»)Q (0))1'^' 

*  ^  —Tl 

and  the  elements  of  the  gradient  vector  3logp/d6  and  Fisher  information  matrix 
F  (6)  are 

.  »r  *  ,  dSaa l40) 

ilSHL  =  -4/„[{f-S;,'(w)Q(<v)e,(w))S«'(»)-|— 

39/  2  t  * 


-X(»X2»S,>>) 


dH  (to)  i  d  to 
dQi  J  2tt 


Fij(e)  =  -e{ 


i^logp 

t/vj  Uv y 


> 


-  2 

— TT 


-1/ 

(<*>) - 


ae,- 


{5^(0)) 


(to) 


aey 


+  s«(“) 


a// (to)  jj 


dQj 


2tt 


(i) 


2.  SIMULTANEOUS  CONFIDENCE  BANDS 

Let  yO'  be  a  variable  such  as  frequency  or  time,  and  consider  a  p -dimensional 
complex  vector  /  (y,9)  with  components  that  are  functions  of  y  and  «  having  con¬ 
tinuous  second  derivatives  with  respect  to  the  parameters  9.  For  example,  the  ele¬ 
ments  of  the  vector  function  /  (y,9)  could  be  the  elements  of  the  spectral  matrix  S, 
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the  squared  magnitude  coherences,  the  impulse  response  functions  of  a  spectral  fac¬ 
tor,  or  the  covariance  functions  of  the  process.  Asymptotically 

/  (7,6)  -  /  (7,0)  =  /  o(7,0)(0  -  0)  (2) 

where  /  9(7,0)  denotes  the  matrix  of  partials  df  (y,©)/^  evaluated  at  0  =  0.  This 
expansion  and  the  Scheffe'  method  (Scheffe',  1953,  1959,  p.68-70)  of  simultaneous 
confidence  intervals  as  applied  in  Newton  &  Pagano  (1984)  lead  to  simultaneous 
confidence  bands  in  the  univariate  case.  For  multivariate  processes,  it  is  of  consid¬ 
erable  interest  to  extend  these  results  to  simultaneous  confidence  bands  on  vector 
and  matrix  functions  of  the  parameters,  e.g.  the  spectral  matrix.  The  extension  that 
we  will  consider  is  the  quadratic  form 

if  (7,0)  -  /  (7,0)}*^  (nW  (7,0)  -  /  (7,0)} 

which  will  be  bounded  as  a  function  of  y.  In  the  multivariate  case,  there  is  a 
choice  to  be  made  for  P  .  For  reasons  of  invariance  and  to  obtain  an  equally  tight 
confidence  bound  on  any  linear  combination  of  /  (y,0)  -  /  (7,0),  P  is  naturally 
chosen  as  the  inverse  of  the  covariance  of  (2). 

In  the  sequel,  a  general  P  is  used  and  then  specialized  to  this  natural  choice. 
The  basic  mathematical  result  needed  for  such  an  extension  is  given  in  the  Appen¬ 
dix  and  is  used  to  prove  the  following  theorem  on  simultaneous  confidence  inter¬ 
vals. 

Theorem  1.  Consider  a  parametric  family  of  stationary  Gaussian  vector 
processes  with  power  cross-spectral  density  matrices  S(y,0)  for  060  satisfying  regu¬ 
larity  conditions  (Whittle,  1953),  and  for  which  the  parameters  are  locally  identifi¬ 
able  so  that  the  Fisher  information  matrix  F(0)  as  given  by  (1)  is  full  rank.  Let 
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y(l),  y(2),...,y(N)  be  a  sample  realization  and  0  be  an  asymptotically  normal  and 
efficient  estimator  of  0.  Let  P(y,0)  be  a  Hermitian  matrix.  Then  as  N  -  <»,  the 
probability  is  at  least  1  -  a  that  simultaneously  for  all  y€T  the  true  p  -vector  func¬ 
tion  /  (y,0)  is  bounded  by 

if  (7,0)  -  /  (7,0)}V  (yW  (7,0)  -  /  (7,0)} 

<XlAtr  f  e(y,0)  F_1(0)  fl(y,Q)P (7,0) 

where  q  is  the  dimension  of  the  vector  0  and  where  X^q  is  the  upper  a  critical 
point  of  the  chi-squared  distribution  on  q  degrees  of  freedom. 

Proof:  As  shown  by  Rothenberg  (1971),  the  parameters  are  locally  identifiable 
if  and  only  if  the  Fisher  information  is  full  rank.  Let  f  (y)  and  /  (y)  denote  /  (y,0) 
evaluated  at  0  and  0  respectively.  The  vector  random  variable  Nm{f  (y)  -/  (y)}  is 
asymptotically  distributed  as  the  normal  random  vector  N  mf  e(y,0)(0  -  0).  Asymp¬ 
totically  (0  -  O)7 F  (0)(0  -  0)  is  a  random  variable,  where  F(0)  is  proportional  to 
sample  size  TV  as  in  (1).  So  the  probability  is  1  -  a  that  the  true  0  satisfies 
(0  -  Of  M  (0  -  0)  <  1  where  M  -  F(0)/X^  .  From  the  Appendix,  this  inequality  is 
satisfied  if  and  only  if  ||  H  (0  -  0)||  2  ^  trHM~lH*  for  all  p  X4 -dimensional  matrices 
H .  Since  the  set  {H  =  P'U2( y,0)f  o(y,0)  for  y€T}  is  possibly  a  proper  subset  of  all 
pXq  -dimensional  matrices  H ,  it  follows  that  asymptotically  with  probability  at  least 
1  -  a  the  inequality 


N{f  (y)  -  /  {y)}*P(y,Hf  (7)  -  /  (7)} 

=  N{f  e(7,0)(0  -  0)}*F(7,0){/‘  e(7,0)(0  ~  «)} 
<  NXl^  tr  f  0(y,0)  5(y,0)P(y,0) 


(3) 


is  satisfied  simultaneously  for  all  7 ST. 


For  the  natural  choice  of  P  =  {f  o(y,0)  F  l(Q)f  0(7,6)}*,  using  t  to  denote  the 
pseudoinverse  of  the  covariance  of  (2),  the  inequality  (3)  becomes 


if  (7) "  /  e("/>9)  F~\Q)  f  S(7,0)}V(7)  -  /  (7)}  *  rX 


2 


where  r  =  Rank(P). 

The  relative  squared  spectral  error  /r[S-1(cD){S(a))  -  5(ai)}]2  is  a  fundamental 


quantity  in  measuring  the  accuracy  of  a  spectral  estimation  procedure.  The  integral 
of  this  quantity  is  asymptotically  the  Kullback-Leibler  information  of  negative 
entropy  (Larimore,  1983)  which  is  a  fundamental  statistical  measure  of  model 
approximation  error.  The  expected  value  of  the  integral  is  proportional  to  the 
number  of  estimated  parameters  divided  by  the  sample  size  (Larimore,  1982).  From 
Theorem  1,  simultaneous  confidence  bands  on  the  sample  relative  squared  spectral 
error  are  given  by  the  following  theorem. 

Theorem  2.  Under  the  conditions  of  Theorem  1,  as  N  -  the  probability  is 
at  least  1  -  a  that  simultaneously  for  all  toSH  the  sample  squared  relative  spectral 
error  is  bounded  as 


tr[S  VojX-S  (to)  -  S  (a))}]2 


=  XltqE  tr  [S-1(<o){5  (oj)  -  S  (to)}]2 


(4) 


where  {gkl( 6)}  —  G  —  F  ^e). 
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Proof:  Asymptotically  S(co)  and  5  (to)  are  equal  so  that  we  may  consider  its 
inverse  in  (4)  a  constant  denoted  5  (to).  To  apply  Theorem  1,  we  consider  the  Her- 
mitian  matrix  A(co)  =  5_1/2(cd){S  (a))  -  ,S(co)},S_1/2(co)  and  express  the  squared  relative 
error  symmetrically  as 

tr  [5_1(co){5  (o>)  -  5  (co)}]2  -  tr  [S_1/2(co){S  (co)  -  S  (a>)}5_1/2(a))]2 

=  tr  AA  =  tr  AA*  =  J  afy  a*  =  /*(u>y  (co)  (5) 

ij 

where  /  (to)  =  vecA  (co)  is  a  vector  containing  the  elements  of  the  matrix  A  (to). 
Application  of  Theorem  1  to  the  vector  function  /  (to)  and  rearrangement  as  in  (5) 
proves  the  inequality.  Expanding  S  (co,0)  as  in  (5),  the  equality  follows  from 

E  tr  [5-1(to){5  (co)  -  5  (to)}]2 

=  tr  X  E(0  -  0)(0  -  Of  S'W)^ 

*  j  °Vk  <™l 

and  using  E  (0  -  0)(0  -  Gf  =  F-1  from  the  asymptotic  efficiency  of  0. 

In  principle  any  quadratic  form  in  the  components  of  the  spectral  matrix  could 
be  used  as  in  Theorem  1  by  introducing  a  weighting  matrix  P  (<o,0).  For  confidence 
intervals  on  the  spectral  matrix,  the  weighting  of  the  inverse  covariance  of  the 
error  in  estimating  the  spectral  matrix  gives  tightest  confidence  bands  which  can  be 
expressed  as 

vec*{S (co)  -  S (co)KS-^^-  gkl (0) } W {S (co)  -  S (co)}  <  X 2W 
For  a  given  confidence  level  a,  this  gives  a  simultaneous  confidence  band  for  all 
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frequencies  to €11  as  a  quadratic  form  in  the  elements  of  S(co)  -  S  (to). 
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APPENDIX:  AN  ELLIPSOIDAL  INEQUALITY  LEMMA 

Lemma  1.  Let  *  be  a  real  q -dimensional  vector  and  for  a  particular  p  let  H 
be  the  class  of  p  -dimensional  complex  matrices  H ,  and  let  Af  be  a  symmetric 
positive  definite  matrix.  Then  tj>  satisfies  4 »rM*  <  1  if  and  only  if  for  every  H  (H 
the  following  inequality  holds 

||//4>||2<tr  HM~lH * 

Proof:  Any  Hermitian  matrix  A  has  an  eigenvalue-eigenvector  expansion 

A  =  XXntXmxm 

m 

From  the  Schwartz  inequality,  (tji'x)2  =s  for  any  *  and  x  with  equality 

if  and  only  if  *  =  cx  for  c  a  scalar.  Let  B  be  such  that  BB*  =  M_1.  Setting 
=  fl-1*  and  denoting  the  quantity  in  braces  {  }  by  A ,  we  have  for  every  H  €ff 
and  every  <J> 

||  2  =  ||  #0*11  2  = 

=  *  A  *  =  4»*  2  *m 

m  m 
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<  i|»>  2  xm  (*«  *«)  =  <t>*Af<|>  =  4>" Af  4>  trHM~lH * 

m 

From  this  the  "only  if"  part  of  the  lemma  follows,  and  choosing 
H *  =  £*-1(iJ>,0, ...  ,0)  gives  A  =  ijnji*  and  strict  equality  which  implies  the  "if”  part 
of  the  lemma. 
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